General finetuning guide: https://rentry.org/llm-training

Will keep track of open-source LM releases here.

Good way to run open source models:

An idea for how to make model inference cheaper and faster:

  • wouldn’t that involve caching each prompt and storing the activation state for each prompt, or at least for the most frequent prompts? I imagine it’s rare for two prompts to be exactly the same, so they would have to cache “similar enough” prompts or something, which could take up a lot of storage
  • this is a cool idea for speeding up open source models though:
    • embed user inputs
    • cluster the embeddings to figure out the most frequent inputs, and cache a model activation for one representative input per cluster
    • on new input, check if you can find a close enough activation
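
Rough sketch of the three steps above, just to make the idea concrete. Everything here is an assumption of mine: `embed` is a toy hashed bag-of-words stand-in for a real embedding model, the clustering is a simple greedy threshold pass rather than proper k-means, and `compute_state` is a hypothetical callback for whatever would actually get cached (KV cache, activations, or the full output). The 0.8 cosine threshold is arbitrary.

```python
import hashlib
from dataclasses import dataclass
from typing import Any, Callable, List, Optional

import numpy as np

DIM = 256  # size of the toy embedding space (arbitrary)


def embed(text: str) -> np.ndarray:
    """Toy stand-in for a real embedding model: hashed bag of words, L2-normalized."""
    vec = np.zeros(DIM)
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


@dataclass
class Cluster:
    vec: np.ndarray              # embedding of the representative input
    text: str                    # representative input (first one seen)
    count: int = 1               # how many inputs landed in this cluster
    state: Optional[Any] = None  # cached model state, filled in by populate()


class SemanticCache:
    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold  # minimum cosine similarity to count as "close enough"
        self.clusters: List[Cluster] = []

    def _nearest(self, vec: np.ndarray, candidates: List[Cluster]):
        """Return (closest cluster, cosine similarity), or (None, 0.0) if no candidates."""
        if not candidates:
            return None, 0.0
        sims = np.stack([c.vec for c in candidates]) @ vec
        best = int(np.argmax(sims))
        return candidates[best], float(sims[best])

    def observe(self, text: str) -> None:
        """Steps 1 and 2: embed the input and greedily assign it to a cluster."""
        vec = embed(text)
        cluster, sim = self._nearest(vec, self.clusters)
        if cluster is not None and sim >= self.threshold:
            cluster.count += 1
        else:
            self.clusters.append(Cluster(vec=vec, text=text))

    def populate(self, compute_state: Callable[[str], Any], top_k: int) -> None:
        """Step 2, continued: cache state only for the top_k most frequent clusters."""
        frequent = sorted(self.clusters, key=lambda c: c.count, reverse=True)[:top_k]
        for cluster in frequent:
            cluster.state = compute_state(cluster.text)

    def lookup(self, text: str) -> Optional[Any]:
        """Step 3: return cached state if a populated cluster is close enough, else None."""
        populated = [c for c in self.clusters if c.state is not None]
        cluster, sim = self._nearest(embed(text), populated)
        if cluster is not None and sim >= self.threshold:
            return cluster.state
        return None


if __name__ == "__main__":
    cache = SemanticCache(threshold=0.8)
    for prompt in [
        "what is the capital of france",
        "what is the capital of france ?",
        "write a poem about cats",
    ]:
        cache.observe(prompt)
    # Placeholder for running the model once and saving its state.
    cache.populate(lambda text: f"cached state for: {text}", top_k=1)
    print(cache.lookup("what is the capital of france please"))
    print(cache.lookup("write a poem about cats"))
```

The storage concern from the first bullet shows up as `top_k`: only the most frequent clusters get a cached state, so storage is bounded by how many clusters you keep rather than by how many prompts you see. Whether a nearby prompt's activations are actually reusable for a different prompt is the open question; this sketch just assumes they are.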

Mistral funding memo: https://drive.google.com/file/d/1gquqRqiT-2Be85p_5w0izGQGgHvVzncQ/view?usp=drivesdk

Was wondering what the strategy is for startups raising 100M+ (e.g. Mistral, Adept) that have only released FOSS models and no product.