* added new wrapper option to turn system style messages into system role messages
* added multirole wrapper
* fixed unrelated CLI print issue from a previous PR (core memory edits were not being shown)
* added new 'hint' wrappers that inject hints into the pre-prefix
* modified basic search functions with extra input sanitization
* updated first message prefix
* add folder generation
* disable default temp until more testing is done
* apply embedding payload patch to search, add input checking for better runtime error messages
* streamlined memory pressure warning now that heartbeats get forced
* added basic heartbeat override
* tested and working on lmstudio (patched typo + patched new bug emerging in latest lmstudio build)
* added lmstudio patch to chatml wrapper
* update the system messages to be informative about the source
* updated string constants after some tuning
* Revert "Revert "nonfunctional 404 quickstart command w/ some other typo corrections""
This reverts commit 5dbdf31f1ce939843ff97e649554d8bc0556a834.
* Revert "Revert "added example config file""
This reverts commit 72a58f6de31f3ff71847bbaf083a91182469f9af.
* tested and working
* added and tested openai quickstart, added fallback to pull from a local copy if the online fetch 404's
* typo
* updated openai key input message to include html link
* renamed --type to --backend, added --latest flag which fetches from online; default is to pull from the local file
* fixed links
* don't add anything except for assistant messages to the global autogen message history
* properly format autogen messages when using local llms (allow naming to get passed through to the prompt formatter)
* add extra handling of autogen's name field in step()
* comments
* init vllm (not tested), uses POST API not openai wrapper
* add to cli config list
* working vllm endpoint
* add model configuration for vllm
---------
Co-authored-by: Sarah Wooders <sarahwooders@gmail.com>
* mark deprecated API section
* CLI bug fixes for azure
* check azure before running
* Update README.md
* Update README.md
* bug fix with persona loading
* remove print
* make errors for cli flags more clear
* format
* fix imports
* fix imports
* add prints
* update lock
* update config fields
* cleanup config loading
* commit
* remove asserts
* refactor configure
* put into different functions
* add embedding default
* pass in config
* fixes
* allow overriding openai embedding endpoint
* black
* trying to patch tests (some circular import errors)
* update flags and docs
* patched support for local llms using endpoint and endpoint type passed via configs, not env vars
* missing files
* fix naming
* fix import
* fix two runtime errors
* patch ollama typo, move ollama model question pre-wrapper, modify question phrasing to include link to readthedocs, also have a default ollama model that has a tag included
* disable debug messages
* made error message for failed load more informative
* don't print dynamic linking function warning unless --debug
* updated tests to work with new cli workflow (disabled openai config test for now)
* added skips for tests when vars are missing
* update bad arg
* revise test to soft pass on empty string too
* don't run configure twice
* extend timeout (try to pass against nltk download)
* update defaults
* typo with endpoint type default
* patch runtime errors for when model is None
* catching another case of 'x in model' when model is None (preemptively)
* allow overrides to local llm related config params
* changed model wrapper selection to pick from a list instead of raw input
* update test for select instead of input
* Fixed bug in endpoint when using local->openai selection, also added validation loop to manual endpoint entry
* updated error messages to be more informative with links to readthedocs
* add back gpt3.5-turbo
---------
Co-authored-by: cpacker <packercharles@gmail.com>
* replaced the LLM_MAX_TOKENS constant with a per-model dictionary; context_window is now set via the config (defaults to 8k)
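A minimal sketch of that change: a per-model lookup dictionary with a config override and an 8k fallback. The model names, values, and helper function here are illustrative assumptions, not the actual MemGPT code:

```python
# Illustrative sketch; model names and token values are assumptions.
LLM_MAX_TOKENS = {
    "gpt-4": 8192,
    "gpt-3.5-turbo": 4096,
}

DEFAULT_CONTEXT_WINDOW = 8192  # 8k fallback when nothing else is known


def get_context_window(model, config_value=None):
    """Prefer an explicit config value, then the dictionary, then the 8k default.
    The `model is not None` guard mirrors the None-model runtime fixes above."""
    if config_value is not None:
        return config_value
    if model is not None and model in LLM_MAX_TOKENS:
        return LLM_MAX_TOKENS[model]
    return DEFAULT_CONTEXT_WINDOW
```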
* pass context window in the calls to local llm APIs
* safety check
* remove dead imports
* context_length -> context_window
* add default for agent.load
* in configure, ask for the model context window if not specified via dictionary
* fix default, also make message about OPENAI_API_BASE missing more informative
* make openai default embedding if openai is default llm
* make openai on top of list
* typo
* also make local the default for embeddings if you're using the localllm endpoint
* provide --context_window flag to memgpt run
* fix runtime error
* stray comments
* stray comment
* I added some JSON repairs that helped me with malformed messages
There are two of them: the first removes hard line feeds that appear
in the message part because the model added those instead of escaped
line feeds. This happens a lot in my experiments, and it actually fixes
them.
The second one is less tested and should handle the case where the model
answers with multiple blocks of strings in quotes or even uses unescaped
quotes. It grabs everything between the message: " and the ending
curly braces, escapes it, and makes it proper JSON that way.
Disclaimer: both functions were written with the help of ChatGPT-4 (I
can't write much Python). I think the first one is quite solid but doubt
that the second one is fully working. Maybe somebody with more Python
skills than me (or with more time) has a better idea for this type of
malformed reply.
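The first repair (escaping hard line feeds) could look roughly like this. This is a sketch of the idea, not the contributed implementation, and it assumes the whole reply is meant to be a single one-line JSON object:

```python
import json


def repair_hard_linefeeds(raw):
    """Replace raw line feeds the model emitted inside the JSON string
    with escaped ones, since valid JSON strings may not contain them."""
    try:
        json.loads(raw)
        return raw  # already valid, nothing to repair
    except json.JSONDecodeError:
        pass
    candidate = raw.replace("\r\n", "\\n").replace("\n", "\\n")
    json.loads(candidate)  # still raises if the reply is broken in other ways
    return candidate
```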
* Moved the repair output behind the debug flag and removed the "clean" one
* Added even more fixes (out of what I just encountered while testing)
It seems that cut-off JSON can be corrected, and sometimes the model is
too lazy to add not just one curly brace but two. I think it does not
"cost" a lot to try them all out. But the exceptions get massive that way :)
* black
* for the final hail mary with extract_first_json, might as well add a double end bracket instead of single
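The brace-completion fallback from these last commits might be sketched as follows; the suffix list and function name are guesses at the approach, not the actual code:

```python
import json


def close_truncated_json(raw):
    """Try appending a closing quote and one or two closing braces to
    JSON the model cut off early; return the first parse that works."""
    for suffix in ("", "}", "}}", '"}', '"}}'):
        try:
            return json.loads(raw + suffix)
        except json.JSONDecodeError:
            continue
    raise ValueError("could not repair truncated JSON")
```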
---------
Co-authored-by: cpacker <packercharles@gmail.com>
* add llamacpp server support
* use gbnf loader
* cleanup and warning about grammar when not using llama.cpp
* added memgpt-specific grammar file
* add grammar support to webui api calls
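Attaching the grammar to a completion request might look like this sketch. llama.cpp's server accepts a `grammar` field containing raw GBNF text; the helper name and defaults here are assumptions:

```python
def build_completion_payload(prompt, grammar=None, n_predict=512):
    """Sketch of a llama.cpp server /completion payload with an optional
    GBNF grammar string; key names follow llama.cpp's HTTP API."""
    payload = {"prompt": prompt, "n_predict": n_predict}
    if grammar is not None:
        payload["grammar"] = grammar  # raw GBNF text, e.g. loaded from a .gbnf file
    return payload
```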
* black
* typo
* add koboldcpp support
* no more defaulting to webui, should error out instead
* fix grammar
* patch kobold (testing, now working) + cleanup log messages
Co-Authored-By: Drake-AI <drake-ai@users.noreply.github.com>
* trying to patch summarize when running with local llms
* moved token magic numbers to constants, made special localllm exception class (TODO catch these for retry), fix summarize bug where it exits early if empty list
* missing file
* raise an exception on no-op summary
* changed summarization logic to walk forwards in list until fraction of tokens in buffer is reached
* added same diff to sync agent
* reverted default max tokens to 8k, cleanup + more error wrapping for better error messages that get caught on retry
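The forward-walk summarization cutoff described a few commits up could be sketched as follows; the 0.75 fraction and the function shape are illustrative assumptions, not the actual implementation:

```python
def select_messages_to_summarize(messages, token_counts, context_window, fraction=0.75):
    """Walk forward from the oldest message, accumulating tokens until a
    fraction of the context window is covered; those messages get
    summarized and the rest are kept in the buffer."""
    budget = context_window * fraction
    total = 0
    cutoff = 0
    for count in token_counts:
        if total + count > budget:
            break
        total += count
        cutoff += 1
    if cutoff == 0:
        # mirrors the no-op summary exception mentioned above
        raise ValueError("no-op summary: nothing to summarize")
    return messages[:cutoff], messages[cutoff:]
```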
* patch for web UI context limit error propagation, using best guess for what the web UI error message is
* add webui token length exception
* remove print
* make no wrapper warning only pop up once
* cleanup
* Add errors to other wrappers
---------
Co-authored-by: Vivian Fang <hi@vivi.sh>