Commit Graph

115 Commits

Author SHA1 Message Date
ifsheldon
dd2f4fc873 fix: Turn off all ensure_ascii of json.dumps (#800) 2024-01-11 23:54:35 -08:00
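The #800 fix disables `ensure_ascii` so non-ASCII text survives serialization instead of being escaped. A minimal illustration of the difference:

```python
import json

payload = {"message": "こんにちは"}

# Default: json.dumps escapes every non-ASCII character into \uXXXX,
# which bloats the payload and garbles it for human readers.
escaped = json.dumps(payload)

# ensure_ascii=False keeps the original UTF-8 characters intact.
readable = json.dumps(payload, ensure_ascii=False)

print(escaped)   # {"message": "\u3053\u3093\u306b\u3061\u306f"}
print(readable)  # {"message": "こんにちは"}
```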
Jim Lloyd
9c06056443 fix string & ws rules in json_func_calls...gbnf (#754) 2024-01-02 10:33:09 -08:00
Charles Packer
0b9fdcf46c fix: added new json test case + added fix for it (also refactored clean json func) (#739) 2023-12-28 23:01:45 -08:00
Charles Packer
653693c398 feat: chatml-noforce-roles wrapper + cli fix (#738)
* added new wrapper option to turn system-style messages into system role messages

* added multirole wrapper

* unrelated issue with cli print due to previous PR (not showing core memory edits)
2023-12-28 22:12:52 -08:00
Charles Packer
723da684b1 don't insert request heartbeat into pause heartbeat (#727) 2023-12-28 12:11:31 -08:00
Vlad Cuciureanu
5358c839ee fix: Typo in info log message and docs (#730) 2023-12-28 12:11:09 -08:00
Charles Packer
43ed0ff714 feat: added new 'hint' wrappers that inject hints into the pre-prefix (#707)
* added new 'hint' wrappers that inject hints into the pre-prefix

* modified basic search functions with extra input sanitization

* updated first message prefix
2023-12-25 11:29:42 -08:00
Charles Packer
3ec8bb1465 fix: misc fixes (#700)
* add folder generation

* disable default temp until more testing is done

* apply embedding payload patch to search, add input checking for better runtime error messages

* streamlined memory pressure warning now that heartbeats get forced
2023-12-25 01:29:13 -08:00
Charles Packer
20f5231aff feat: added basic heartbeat override heuristics (#621)
* added basic heartbeat override

* tested and working on lmstudio (patched typo + patched new bug emerging in latest lmstudio build)

* added lmstudio patch to chatml wrapper

* update the system messages to be informative about the source

* updated string constants after some tuning
2023-12-24 23:46:00 -08:00
Charles Packer
6e3d9e143e set default temp to 0.8 (#696) 2023-12-24 23:36:22 -08:00
Charles Packer
7f20b63553 added docs page, updated messages about param loading to show params loaded (#688) 2023-12-23 01:48:13 -08:00
Charles Packer
4f23934e04 feat: Add new wrapper defaults (#656) 2023-12-21 17:05:38 +04:00
Charles Packer
3bc9ed01f7 patch bug with raw mode (#663) 2023-12-20 17:25:52 -08:00
Charles Packer
f532ffc41f feat: Add memgpt quickstart command (#641)
* Revert "Revert "nonfunctional 404 quickstart command w/ some other typo corrections""

This reverts commit 5dbdf31f1ce939843ff97e649554d8bc0556a834.

* Revert "Revert "added example config file""

This reverts commit 72a58f6de31f3ff71847bbaf083a91182469f9af.

* tested and working

* added and tested openai quickstart, added fallback to pull from a local copy if the internet fetch 404s

* typo

* updated openai key input message to include html link

* renamed --type to --backend, added --latest flag which fetches from online; default is to pull from local file

* fixed links
2023-12-20 00:00:40 -08:00
Charles Packer
f8b99b562f feat: Migrate docs (#646)
* updated docs for readme

* Update index.md

* Update index.md

* added header

* broken link

* sync heading sizes

* fix various broken rel links

* Update index.md

* added webp

* Update index.md

* strip mkdocs/rtk files

* replaced readthedocs references with readme
2023-12-18 20:29:24 -08:00
Charles Packer
a9feec245f feat: Add common + custom settings files for completion endpoints (#631) 2023-12-18 15:22:24 +04:00
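The common + custom settings split in #631 layers backend-specific values over shared defaults. A minimal sketch of that layering (the setting names below are illustrative, not the project's actual keys):

```python
def load_completion_settings(common, custom):
    """Merge completion-endpoint settings: backend-specific custom values
    override the shared common defaults."""
    merged = dict(common)
    merged.update(custom)
    return merged

# Illustrative values only: a shared baseline plus one backend override.
common = {"temperature": 0.8, "stop": ["</s>"]}
custom = {"temperature": 0.2, "grammar": "memgpt.gbnf"}
settings = load_completion_settings(common, custom)
```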
cpacker
7f2edd8dd7 Revert "nonfunctional 404 quickstart command w/ some other typo corrections"
This reverts commit 22119cfb037c7d9379653006eafa03042eafcbe8.
2023-12-18 00:48:47 -08:00
cpacker
9c8ed92ad7 nonfunctional 404 quickstart command w/ some other typo corrections 2023-12-18 00:45:02 -08:00
Charles Packer
070c0123c6 migrate to using completions endpoint by default (#628)
* migrate to using completions endpoint by default

* added note about version to docs
2023-12-15 12:29:52 -08:00
Charles Packer
490c0ccd4a patch bug where None.copy() throws runtime error (#617) 2023-12-14 12:52:53 -08:00
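The #617 patch guards against calling `.copy()` on a value that may be `None`, which raises `AttributeError` at runtime. A minimal sketch of the guard pattern (the helper name is hypothetical):

```python
def safe_copy(settings):
    """Return a shallow copy of a dict-like value, tolerating None.

    None.copy() raises AttributeError; substituting an empty dict with
    `or {}` lets downstream code mutate the copy freely.
    """
    return (settings or {}).copy()
```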
Charles Packer
8cc1ed0f59 updated local APIs to return usage info (#585)
* updated APIs to return usage info

* tested all endpoints
2023-12-13 21:11:20 -08:00
Charles Packer
0d8b95e2a7 AutoGen misc fixes (#603)
* don't add anything except for assistant messages to the global autogen message history

* properly format autogen messages when using local llms (allow naming to get passed through to the prompt formatter)

* add extra handling of autogen's name field in step()

* comments
2023-12-10 20:52:21 -08:00
Charles Packer
b65980e2b3 add back dotdict for backcompat (#572) 2023-12-04 23:02:22 -08:00
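Re-adding dotdict in #572 keeps older attribute-style config access working. A minimal sketch of such a class (not necessarily the project's exact implementation):

```python
class DotDict(dict):
    """Dict subclass allowing attribute-style access for backwards compatibility."""

    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError:
            # Raise AttributeError (not KeyError) for missing keys so that
            # hasattr() and the pickle/copy protocols behave correctly.
            raise AttributeError(key)

    def __setattr__(self, key, value):
        self[key] = value
```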
Charles Packer
df999de4c1 use a consistent warning prefix across codebase (#569) 2023-12-04 11:38:51 -08:00
Sarah Wooders
f7b4213ef8 Fix vLLM endpoint to have correct suffix (#548)
* minor fix

* fix vllm endpoint

* fix docs
2023-12-01 14:11:05 -08:00
Charles Packer
b741b601fb Update AutoGen documentation and notebook example (#540)
* Update AutoGen documentation

* Update webui.md

* Update webui.md

* Update lmstudio.md

* Update lmstudio.md

* Update mkdocs.yml

* Update README.md

* Update README.md

* Update README.md

* Update autogen.md

* Update local_llm.md

* Update local_llm.md

* Update autogen.md

* Update autogen.md

* Update autogen.md

* refreshed the autogen examples + notebook (notebook is untested)

* unrelated patch of typo I noticed

* poetry remove pyautogen, then manually removed autogen extra in .toml

* add pdf dependency

---------

Co-authored-by: Sarah Wooders <sarahwooders@gmail.com>
2023-11-30 17:45:04 -08:00
Sarah Wooders
2857ae1c81 Remove usage of BACKEND_TYPE (#539) 2023-11-30 14:18:25 -08:00
Charles Packer
5e7676e133 Remove openai package and migrate to requests (#534) 2023-11-30 13:00:13 -08:00
Charles Packer
a367ee4072 patched a bug where outputs of a regex extraction weren't getting cast back to string, causing an issue when the dict was then passed to json.dumps() (#533) 2023-11-29 12:57:05 -08:00
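The #533 bug was a dict containing non-string values from a regex extraction being handed to `json.dumps()`. A hedged sketch of the pattern, with explicit casts (the function name and regex are hypothetical, not the project's code):

```python
import json
import re

def extract_function_call(raw):
    """Pull a function name and arguments out of raw model output.

    Values recovered by the regex are cast to plain strings before the
    dict is serialized; str() is a no-op for strings but rescues any
    other object that slipped through.
    """
    m = re.search(r"function:\s*(\w+)\s*args:\s*(.*)", raw, re.DOTALL)
    if m is None:
        return None
    call = {"function": str(m.group(1)), "arguments": str(m.group(2)).strip()}
    return json.dumps(call)
```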
Sarah Wooders
b05b09439f Add user field for vLLM endpoint (#531) 2023-11-29 12:30:42 -08:00
Sarah Wooders
ed356dd82c Add support for HuggingFace Text Embedding Inference endpoint for embeddings (#524) 2023-11-27 16:28:49 -08:00
Charles Packer
f92d8dfc8b add a longer prefix to the default wrapper (#510)
* add a longer prefix to the default wrapper (not just the opening brace, but up to the 'function: ' part, since that is always present)

* drop print
2023-11-26 19:59:49 -08:00
Charles Packer
2121130a88 add new manual json parser meant to catch send_message calls with trailing bad extra chars (#509)
* add new manual json parser meant to catch send_message calls with stray trailing chars, patch json error passing

* typo
2023-11-25 16:30:12 -08:00
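The manual parser in #509 tolerates stray characters after an otherwise valid object. A hedged sketch of the idea (not the project's actual function): retry parsing on progressively shorter prefixes that end at a `}`.

```python
import json

def clean_json(raw):
    """Best-effort JSON load that tolerates stray trailing characters.

    Models sometimes emit a valid object followed by junk (backticks,
    prose, extra braces), so fall back to parsing prefixes of the input
    that end at each '}' from the right.
    """
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    end = raw.rfind("}")
    while end != -1:
        try:
            return json.loads(raw[: end + 1])
        except json.JSONDecodeError:
            end = raw.rfind("}", 0, end)
    raise ValueError("no parseable JSON object found")
```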
Charles Packer
de0ccea181 vLLM support (#492)
* init vllm (not tested), uses POST API not openai wrapper

* add to cli config list

* working vllm endpoint

* add model configuration for vllm

---------

Co-authored-by: Sarah Wooders <sarahwooders@gmail.com>
2023-11-21 15:16:03 -08:00
Charles Packer
8a7a64c7f9 patch web UI (#484)
* patch web UI

* set truncation_length
2023-11-19 14:56:10 -08:00
Charles Packer
576795ffdb move webui to new openai completions endpoint, but also provide existing functionality via webui-legacy backend (#468) 2023-11-15 23:08:30 -08:00
Sarah Wooders
ec2bda4966 Refactor config + determine LLM via config.model_endpoint_type (#422)
* mark deprecated API section

* CLI bug fixes for azure

* check azure before running

* Update README.md

* Update README.md

* bug fix with persona loading

* remove print

* make errors for cli flags more clear

* format

* fix imports

* fix imports

* add prints

* update lock

* update config fields

* cleanup config loading

* commit

* remove asserts

* refactor configure

* put into different functions

* add embedding default

* pass in config

* fixes

* allow overriding openai embedding endpoint

* black

* trying to patch tests (some circular import errors)

* update flags and docs

* patched support for local llms using endpoint and endpoint type passed via configs, not env vars

* missing files

* fix naming

* fix import

* fix two runtime errors

* patch ollama typo, move ollama model question pre-wrapper, modify question phrasing to include link to readthedocs, also have a default ollama model that has a tag included

* disable debug messages

* made error message for failed load more informative

* don't print dynamic linking function warning unless --debug

* updated tests to work with new cli workflow (disabled openai config test for now)

* added skips for tests when vars are missing

* update bad arg

* revise test to soft pass on empty string too

* don't run configure twice

* extend timeout (try to pass against nltk download)

* update defaults

* typo with endpoint type default

* patch runtime errors for when model is None

* catching another case of 'x in model' when model is None (preemptively)

* allow overrides to local llm related config params

* made model wrapper selection from a list vs raw input

* update test for select instead of input

* Fixed bug in endpoint when using local->openai selection, also added validation loop to manual endpoint entry

* updated error messages to be more informative with links to readthedocs

* add back gpt3.5-turbo

---------

Co-authored-by: cpacker <packercharles@gmail.com>
2023-11-14 15:58:19 -08:00
Charles Packer
7f950b05e8 Patch local LLMs with context_window (#416)
* patch

* patch ollama

* patch lmstudio

* patch kobold
2023-11-10 12:06:41 -08:00
Charles Packer
dab47001a9 Fix max tokens constant (#374)
* stripped LLM_MAX_TOKENS constant, instead it's a dictionary, and context_window is set via the config (defaults to 8k)

* pass context window in the calls to local llm APIs

* safety check

* remove dead imports

* context_length -> context_window

* add default for agent.load

* in configure, ask for the model context window if not specified via dictionary

* fix default, also make message about OPENAI_API_BASE missing more informative

* make openai default embedding if openai is default llm

* make openai on top of list

* typo

* also make local the default for embeddings if you're using localllm instead of the locallm endpoint

* provide --context_window flag to memgpt run

* fix runtime error

* stray comments

* stray comment
2023-11-09 17:59:03 -08:00
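Replacing the single LLM_MAX_TOKENS constant in #374 with a per-model table plus a configurable context_window can be sketched like this (model names and sizes are illustrative assumptions; the 8k default is the one mentioned in the commit):

```python
# Per-model context-window table replacing a single constant.
LLM_MAX_TOKENS = {
    "gpt-4": 8192,
    "gpt-3.5-turbo": 4096,
    "gpt-3.5-turbo-16k": 16384,
}
DEFAULT_CONTEXT_WINDOW = 8192  # fallback when the model is unknown

def get_context_window(model, override=None):
    """Resolve the context window: an explicit config override wins,
    then the per-model table, then the 8k default."""
    if override is not None:
        return override
    return LLM_MAX_TOKENS.get(model, DEFAULT_CONTEXT_WINDOW)
```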
Hans Raaf
12f9bf29fd I added some json repairs that helped me with malformed messages (#341)
* I added some json repairs that helped me with malformed messages

There are two of them: the first removes hard line feeds that appear
in the message part because the model added those instead of escaped
line feeds. This happens a lot in my experiments, and it actually fixes
them.

The second one is less tested and should handle the case where the model
answers with multiple blocks of strings in quotes, or even uses unescaped
quotes. It grabs everything between the `message: "` and the ending
curly braces, escapes it, and makes it proper JSON that way.

Disclaimer: both functions were written with the help of ChatGPT-4 (I
can't write much Python). I think the first one is quite solid, but I
doubt that the second one is fully working. Maybe somebody with more
Python skills than me (or with more time) has a better idea for this
type of malformed reply.

* Moved the repair output behind the debug flag and removed the "clean" one

* Added even more fixes (out of what I just encountered while testing)

It seems that cut-off JSON can be corrected, and sometimes the model is
too lazy to add not just one curly brace but two. I think it does not
"cost" a lot to try them all out. But the exceptions get massive that way :)

* black

* for the final hail mary with extract_first_json, might as well add a double end bracket instead of single

---------

Co-authored-by: cpacker <packercharles@gmail.com>
2023-11-09 17:05:42 -08:00
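The repairs described in #341 can be sketched as a cascade of cheap candidate fixes, each tried in order (a hedged sketch, not the contributed code): escape raw newlines that the model emitted inside strings, then try appending one or two closing braces for cut-off output.

```python
import json

def repair_json(raw):
    """Try a cascade of cheap repairs on malformed model JSON."""
    candidates = [
        raw,
        # Hard line feeds inside a string value are invalid JSON;
        # replacing them with escaped \n often fixes the message.
        raw.replace("\n", "\\n"),
        # Truncated output is often missing one or both closing braces.
        raw + "}",
        raw + "}}",
    ]
    for candidate in candidates:
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    raise ValueError("could not repair JSON")
```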
Mo Nuaimat
8adef204e6 Fixing some dict value checking for function_call (#249) 2023-11-06 15:44:51 -08:00
Charles Packer
fe2d8b2b2f add ollama support (#314)
* untested

* patch

* updated

* clarified using tags in docs

* tested ollama, working

* fixed template issue by creating dummy template, also added missing context length indicator

* moved count_tokens to utils.py

* clean
2023-11-06 15:11:22 -08:00
Charles Packer
5ac8635446 cleanup #326 (#333) 2023-11-06 12:57:19 -08:00
borewik
dbbb3fc14b Update chat_completion_proxy.py (#326)
grammar_name has to be defined; if not, there's an issue with line 92
2023-11-06 12:53:17 -08:00
Charles Packer
e90c00ad63 Add grammar-based sampling (for webui, llamacpp, and koboldcpp) (#293)
* add llamacpp server support

* use gbnf loader

* cleanup and warning about grammar when not using llama.cpp

* added memgpt-specific grammar file

* add grammar support to webui api calls

* black

* typo

* add koboldcpp support

* no more defaulting to webui, should error out instead

* fix grammar

* patch kobold (testing, now working) + cleanup log messages

Co-Authored-By: Drake-AI <drake-ai@users.noreply.github.com>
2023-11-04 12:02:44 -07:00
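Grammar-based sampling (#293) constrains local backends with a GBNF file so the model can only emit output matching the grammar. An illustrative fragment in that spirit (rule names and shape are assumptions, not the shipped memgpt grammar):

```
# Constrain the sampler so output is always a JSON object that starts
# with a "function" key followed by a "params" object.
root   ::= "{" ws "\"function\"" ws ":" ws string ws "," ws "\"params\"" ws ":" ws object ws "}"
object ::= "{" ws "}" | "{" ws string ws ":" ws string ws "}"
string ::= "\"" [a-zA-Z0-9_ ]* "\""
ws     ::= [ \t\n]*
```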
danx0r
2f56e0eaf5 Fix #261 (#300)
* should fix issue 261 - pickle fail on DotDict class

* black patch

---------

Co-authored-by: cpacker <packercharles@gmail.com>
2023-11-03 23:33:59 -07:00
Charles Packer
6b4008c72e more stop tokens (#288) 2023-11-03 12:25:37 -07:00
Charles Packer
437306388f Improvements to JSON handling for local LLMs (#269)
* some extra json hacks

* add 'smart' json loader to other wrappers

* added chatml related stop tokens by default
2023-11-03 00:18:31 -07:00
Charles Packer
fde0087a19 Patch summarize when running with local llms (#213)
* trying to patch summarize when running with local llms

* moved token magic numbers to constants, made special localllm exception class (TODO catch these for retry), fix summarize bug where it exits early if empty list

* missing file

* raise an exception on no-op summary

* changed summarization logic to walk forwards in list until fraction of tokens in buffer is reached

* added same diff to sync agent

* reverted default max tokens to 8k, cleanup + more error wrapping for better error messages that get caught on retry

* patch for web UI context limit error propagation, using best guess for what the web UI error message is

* add webui token length exception

* remove print

* make no wrapper warning only pop up once

* cleanup

* Add errors to other wrappers

---------

Co-authored-by: Vivian Fang <hi@vivi.sh>
2023-11-02 23:44:02 -07:00
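The revised summarization logic in #213 walks forward through the message list until a fraction of the token budget is reached, and raises on a no-op summary instead of looping. A hedged sketch (function name, signature, and the 0.5 fraction are illustrative assumptions):

```python
def select_messages_to_summarize(messages, token_counts, context_window,
                                 fraction=0.5):
    """Walk forward from the oldest message, accumulating tokens until the
    cut-off fraction of the context window is reached; everything before
    the cut-off gets summarized, the rest is kept verbatim.
    """
    budget = context_window * fraction
    total = 0
    cutoff = 0
    for i, n_tokens in enumerate(token_counts):
        total += n_tokens
        if total > budget:
            break
        cutoff = i + 1
    if cutoff == 0:
        # Nothing fits under the budget: a no-op summary would make no
        # progress, so raise instead of silently returning everything.
        raise ValueError("no-op summarization: no messages under token budget")
    return messages[:cutoff], messages[cutoff:]
```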
Vivian Fang
b9c229de35 Update README.md 2023-11-01 18:32:19 -07:00