10 Commits

Author SHA1 Message Date
Charles Packer
a9feec245f feat: Add common + custom settings files for completion endpoints (#631) 2023-12-18 15:22:24 +04:00
Charles Packer
070c0123c6 migrate to using completions endpoint by default (#628)
* migrate to using completions endpoint by default

* added note about version to docs
2023-12-15 12:29:52 -08:00
Charles Packer
8cc1ed0f59 updated local APIs to return usage info (#585)
* updated APIs to return usage info

* tested all endpoints
2023-12-13 21:11:20 -08:00
Sarah Wooders
ec2bda4966 Refactor config + determine LLM via config.model_endpoint_type (#422)
* mark deprecated API section

* CLI bug fixes for azure

* check azure before running

* Update README.md

* Update README.md

* bug fix with persona loading

* remove print

* make errors for cli flags more clear

* format

* fix imports

* fix imports

* add prints

* update lock

* update config fields

* cleanup config loading

* commit

* remove asserts

* refactor configure

* put into different functions

* add embedding default

* pass in config

* fixes

* allow overriding openai embedding endpoint

* black

* trying to patch tests (some circular import errors)

* update flags and docs

* patched support for local llms using endpoint and endpoint type passed via configs, not env vars

* missing files

* fix naming

* fix import

* fix two runtime errors

* patch ollama typo, move ollama model question pre-wrapper, modify question phrasing to include link to readthedocs, also have a default ollama model that has a tag included

* disable debug messages

* made error message for failed load more informative

* don't print dynamic linking function warning unless --debug

* updated tests to work with new cli workflow (disabled openai config test for now)

* added skips for tests when vars are missing

* update bad arg

* revise test to soft pass on empty string too

* don't run configure twice

* extend timeout (try to pass against nltk download)

* update defaults

* typo with endpoint type default

* patch runtime errors for when model is None

* catching another case of 'x in model' when model is None (preemptively)

* allow overrides to local llm related config params

* made model wrapper selection from a list vs raw input

* update test for select instead of input

* Fixed bug in endpoint when using local->openai selection, also added validation loop to manual endpoint entry

* updated error messages to be more informative with links to readthedocs

* add back gpt3.5-turbo

---------

Co-authored-by: cpacker <packercharles@gmail.com>
2023-11-14 15:58:19 -08:00
Charles Packer
7f950b05e8 Patch local LLMs with context_window (#416)
* patch

* patch ollama

* patch lmstudio

* patch kobold
2023-11-10 12:06:41 -08:00
Charles Packer
dab47001a9 Fix max tokens constant (#374)
* stripped LLM_MAX_TOKENS constant, instead it's a dictionary, and context_window is set via the config (defaults to 8k)

* pass context window in the calls to local llm APIs

* safety check

* remove dead imports

* context_length -> context_window

* add default for agent.load

* in configure, ask for the model context window if not specified via dictionary

* fix default, also make message about OPENAI_API_BASE missing more informative

* make openai default embedding if openai is default llm

* make openai on top of list

* typo

* also make local the default for embeddings if you're using localllm instead of the locallm endpoint

* provide --context_window flag to memgpt run

* fix runtime error

* stray comments

* stray comment
2023-11-09 17:59:03 -08:00
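
The lookup described in #374 (a per-model dictionary plus a configurable context_window that defaults to 8k) might look roughly like the sketch below. The function name resolve_context_window, the "DEFAULT" key, and the table values are assumptions made for illustration; they are not taken from the commit itself.

```python
from typing import Optional

# Illustrative table only: the real keys and values in the repository may differ.
LLM_MAX_TOKENS = {
    "DEFAULT": 8192,
    "gpt-4": 8192,
    "gpt-3.5-turbo": 4096,
}


def resolve_context_window(model: Optional[str], configured: Optional[int] = None) -> int:
    """Pick a context window: explicit config first, then the table, then the default."""
    if configured is not None:
        if configured <= 0:
            raise ValueError("context_window must be a positive integer")
        return configured
    # Guard against model being None before using it as a dictionary key.
    if model is not None and model in LLM_MAX_TOKENS:
        return LLM_MAX_TOKENS[model]
    return LLM_MAX_TOKENS["DEFAULT"]
```

An explicit value (for example one passed through a flag such as --context_window) wins over the table, so local models that are not listed can still be given a correct limit.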
Charles Packer
e90c00ad63 Add grammar-based sampling (for webui, llamacpp, and koboldcpp) (#293)
* add llamacpp server support

* use gbnf loader

* cleanup and warning about grammar when not using llama.cpp

* added memgpt-specific grammar file

* add grammar support to webui api calls

* black

* typo

* add koboldcpp support

* no more defaulting to webui, should error out instead

* fix grammar

* patch kobold (testing, now working) + cleanup log messages

Co-Authored-By: Drake-AI <drake-ai@users.noreply.github.com>
2023-11-04 12:02:44 -07:00
raisindetre
12ca6e98af Changes to lmstudio to fix JSON decode error (#208)
* Changes to lmstudio to fix JSON decode error

* black formatting

* properly handle context overflow error (propagate exception up the stack with recognizable error message) + add backwards compat option to use completions endpoint

* set max tokens to 8k, comment out the overflow policy (use memgpt's overflow policy)

* 8k not 3k

---------

Co-authored-by: Matt Poff <mattpoff@Matts-MacBook-Pro-2.local>
Co-authored-by: cpacker <packercharles@gmail.com>
2023-10-30 23:08:00 -07:00
Charles Packer
139f06f091 use urljoin instead of path.join (#173) 2023-10-28 20:41:44 -07:00
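
A short sketch of why #173 matters: path-joining functions use the operating system's separator, so on Windows they can put a backslash into a URL. The helper name build_url and the example host are hypothetical, and ntpath is used here only to reproduce Windows-style joining on any platform.

```python
import ntpath
from urllib.parse import urljoin


def build_url(base: str, path: str) -> str:
    """Join a base URL and a relative path with exactly one slash between them."""
    # urljoin replaces the last path segment of the base unless the base ends
    # with "/", so normalise both sides before joining.
    return urljoin(base.rstrip("/") + "/", path.lstrip("/"))


# Windows-style path joining inserts a backslash, which is not valid here.
broken = ntpath.join("http://localhost:1234", "v1/completions")

# urljoin always produces forward slashes, regardless of platform.
fixed = build_url("http://localhost:1234", "/v1/completions")
```

The normalisation step is a design choice of this sketch: without it, a base such as http://localhost:1234/api would lose its final segment when joined.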
Charles Packer
78cb676cd2 LM Studio inference server support (#167)
* updated airo wrapper to catch specific case where extra closing } is missing

* added lmstudio support
2023-10-28 18:30:35 -07:00
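
The first bullet of #167 describes tolerating model output whose final closing } is missing. One way to handle that case is sketched below; the function name parse_function_call and the payload shape are assumptions for illustration, not the wrapper's actual code.

```python
import json


def parse_function_call(raw: str) -> dict:
    """Parse model output as JSON, retrying once with a closing brace appended."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Only the specific "one missing closing brace" case is repaired;
        # any other malformed output still raises json.JSONDecodeError.
        return json.loads(raw + "}")
```

Keeping the repair this narrow means other kinds of malformed output still surface as errors rather than being silently altered.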