Commit History

Marked the logger as a daemon thread so it doesn't prevent the Python interpreter from exiting
d5cf91c

alfraser committed on
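
A minimal sketch of the daemon-thread idea in the commit above. The names `log_queue`, `drain_logs` and `written` are hypothetical and not taken from the repository; the only point is that a thread created with `daemon=True` does not keep the interpreter alive at exit.

```python
import queue
import threading

# Hypothetical names for illustration only.
log_queue = queue.Queue()
written = []

def drain_logs():
    """Consume log messages forever; daemon=True lets the process exit anyway."""
    while True:
        message = log_queue.get()
        written.append(message)
        log_queue.task_done()

# daemon=True is the relevant flag: the interpreter will not wait for this thread.
logger_thread = threading.Thread(target=drain_logs, daemon=True, name="logger")
logger_thread.start()

log_queue.put("architecture started")
log_queue.join()  # block until the queued message has been handled
```

One trade-off of this design is that messages still in the queue when the interpreter exits are dropped, which is why the sketch calls `log_queue.join()` before finishing.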

Fixed bug where the Logger was logging its own name and not that of the architecture.
30696ca

alfraser committed on

Added an environment variable to tell the tokenizers explicitly that we are multi-threading, which prevents a bunch of warnings
dd89a23

alfraser committed on
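
The commit above does not name the variable or its value. As an assumption, the sketch below uses `TOKENIZERS_PARALLELISM`, which the Hugging Face tokenizers library reads; setting it explicitly silences that library's parallelism warning. The value `"false"` is also an assumption.

```python
import os

# Assumption: the commit refers to TOKENIZERS_PARALLELISM; the value is a guess.
# Set it before the tokenizers library is first used so the setting is seen.
os.environ.setdefault("TOKENIZERS_PARALLELISM", "false")

tokenizers_flag = os.environ["TOKENIZERS_PARALLELISM"]
```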

Added ability to set the number of testing threads dynamically from the UI
fc8884e

alfraser committed on

Implemented a single-threaded worker for writing the logs to the JSON file, giving controlled access to that file-system resource now that the tests are multi-threaded.
c0a1e47

alfraser committed on
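
One way to picture the single-writer pattern described in the commit above: many test threads enqueue records, and one worker thread is the only code that touches the JSON file. The names `record_queue`, `write_records` and `STOP`, and the file layout (a JSON list), are hypothetical and not the repository's actual code.

```python
import json
import os
import queue
import tempfile
import threading

record_queue = queue.Queue()
STOP = object()  # sentinel telling the writer to finish

def write_records(path):
    """Sole writer: append each queued record to the JSON list stored at path."""
    while True:
        record = record_queue.get()
        if record is STOP:
            break
        records = []
        if os.path.exists(path):
            with open(path) as f:
                records = json.load(f)
        records.append(record)
        with open(path, "w") as f:
            json.dump(records, f)

log_path = os.path.join(tempfile.mkdtemp(), "test_log.json")
writer = threading.Thread(target=write_records, args=(log_path,))
writer.start()

# Any number of test threads could enqueue here; only the writer opens the file.
for i in range(3):
    record_queue.put({"test": i})
record_queue.put(STOP)
writer.join()

with open(log_path) as f:
    saved = json.load(f)
```

Because the queue serialises the records, no file lock is needed in this sketch.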

Modified test runner to dispatch requests in parallel to make use of the fact that there is a lot of wait time for the LLM. Defaulting to 16 threads.
bb7db2c

alfraser committed on
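
The parallel dispatch in the commit above could look roughly like the following, assuming a thread pool is used. `fake_llm_call` and `run_tests` are hypothetical stand-ins; the sleep simulates time spent waiting on the LLM, which is where threads help.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_llm_call(question):
    """Stand-in for a network call that mostly waits on the LLM."""
    time.sleep(0.05)
    return f"answer to {question}"

def run_tests(questions, max_threads=16):
    """Dispatch all questions in parallel; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return list(pool.map(fake_llm_call, questions))

questions = [f"q{i}" for i in range(16)]
answers = run_tests(questions)
```

Making `max_threads` a parameter also fits the later commit that lets the thread count be set from the UI.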

Saved trace records offline
e999f4f

alfraser committed on

Saved trace records offline
cb184aa

alfraser committed on

Removed a debug print line
edb7b35

alfraser committed on

Added runner for pricing fact checks to assess the level of fact embedding in the latest model
c319c31

alfraser committed on

Correct the numbering on the latest architecture
963fb4a

alfraser committed on

Added a setup for V7 of the fine-tuned model to test that
3b67117

alfraser committed on

Saved file records to DB. Fixed a print to show the correct test-group name.
3a9dec1

alfraser committed on

Tried a tweak to the prompt to lean into facts
d452e98

alfraser committed on

Added a config to test V6 fine-tuned model
946c170

alfraser committed on

Saved test records and refactored reporter UI code into smaller functions
a9d1d49

alfraser committed on

Updated the testing page to show the request/response pairs
9cec719

alfraser committed on

Updated the offline save to save the actual request and response text
34061f5

alfraser committed on

Added the safety components to the fine-tuning model to make the test fair
2cc68c2

alfraser committed on

Fixed display of architecture name
d6b7bf0

alfraser committed on

Logged start of architecture invocation so if one stalls you can see what it is in the logs
bdc40cf

alfraser committed on

Tidied up old setups, leaving only the best working fine-tuned one for now
12371dd

alfraser committed on

Update to the new URL for model v5
cce95a5

alfraser committed on

Update to the new URL for model v3
624fbec

alfraser committed on

Trying fine-tuning yet another way - all run, now testing v3 of the model
a48e190

alfraser committed on

Added a raw prompt format that passes the input straight through, so I can test a number of hand-typed examples
92aa543

alfraser committed on

Added another test prompt format
20cff9b

alfraser committed on

Added a push button to the UI that generates a random question, so users don't have to phrase something themselves.
7c479ac

alfraser committed on

Tweaked the test generator and updated the tests
ca7e5c7

alfraser committed on

Added the option to pause a failed endpoint in order to be able to kick it with a restart
5ecd875

alfraser committed on

Added the test question generator and increased the size of the question bank to 500
59b2aff

alfraser committed on

Added a missing comment
bcc302b

alfraser committed on

Fixed a bug where, if the architecture had failed entirely and not generated a response, loading the TestGroups would crash. The root cause of the missing response still needs fixing, but the failure should be handled gracefully here in any event.
4332953

alfraser committed on

Refactored loading the TestGroups to make the structure of the json load and the DB load the same and clearer
c76e6f5

alfraser committed on

Added comments throughout
e912278

alfraser committed on

Tweaked message
7b8cf3a

alfraser committed on

Switched endpoint control to use the writeable token as it was inconsistent with the normal token.
2122072

alfraser committed on

Added front page message about the endpoints being scheduled.
c0848c6

alfraser committed on

Fixed bug with default prompt style not being valid
190ec66

alfraser committed on

Configured more architectures, each with a different prompt style, to try to debug the fine-tuning issue
53169ab

alfraser committed on

Trying a different prompting style
3991f6c

alfraser committed on

Added the ability to change the prompt style via a parameter as I try to debug the fine-tuning issue.
ee60fb2

alfraser committed on

Password protected the system controls page
c6ae5fd

alfraser committed on

Tweaked the training data format to try and fix the issue of the model repeating the question over and over
abcd8a9

alfraser committed on

Added ability to select which models to compare side by side, allowing for more flexibility in testing my fine-tuned llamas
57b94ca

alfraser committed on

Moved trace records to DB
79f35e2

alfraser committed on

Added a time/response length plot to the charts
dc63ce8

alfraser committed on

Added loading of test groups from both the DB and the local file and merging these two
1fb12dc

alfraser committed on

Added message to system status
acca6aa

alfraser committed on

Added the SQLite DB where I will archive the test results, along with the archiving code
843d9d3

alfraser committed on
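
A minimal sketch of archiving test results to SQLite, as the commit above describes. The table name, columns and the `archive_results` helper are hypothetical; the repository's actual schema is not shown in this history.

```python
import sqlite3

def archive_results(conn, results):
    """Store (architecture, question, response) rows in a hypothetical table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS test_results "
        "(architecture TEXT, question TEXT, response TEXT)"
    )
    conn.executemany("INSERT INTO test_results VALUES (?, ?, ?)", results)
    conn.commit()

# An in-memory database keeps the sketch self-contained; a file path works the same way.
conn = sqlite3.connect(":memory:")
archive_results(conn, [("arch_a", "q1", "r1"), ("arch_b", "q1", "r2")])
archived_count = conn.execute("SELECT COUNT(*) FROM test_results").fetchone()[0]
```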