Commits · alfraser/llm-arch

Updated the status message on the endpoint start

7b55a1a

alfraser committed on Mar 7, 2024

Fixed bug where display value was preventing endpoint from starting

f280a0c

alfraser committed on Mar 7, 2024

Added logging to endpoint start

2aa6e54

alfraser committed on Mar 7, 2024

Migrated from using print statements in the application code to using the logger module (left prints in files intended to be run as scripts)

2d7adb6

alfraser committed on Mar 4, 2024
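
For illustration only, a minimal sketch of the print-to-logger pattern this commit describes, assuming Python's standard `logging` module is the "logger module" meant. The function `start_endpoint` and the endpoint name are hypothetical and do not come from the repository.

```python
import logging

# Module-level logger; application code logs through this instead of print().
logger = logging.getLogger(__name__)


def start_endpoint(name: str) -> bool:
    """Hypothetical application function that logs rather than prints."""
    logger.info("Requesting start of endpoint %s", name)
    return True


if __name__ == "__main__":
    # Files run as scripts configure logging once and may still print directly.
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
    )
    print("Running as a script")
    start_endpoint("example-endpoint")
```

Keeping configuration in the script entry point means importing the module elsewhere does not change the host application's logging setup.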

Fixed bug in status bar

1979a9d

alfraser committed on Mar 4, 2024

Added system status to the default on each page, with an option for users to request start of the LLM inference endpoints. Set the endpoints to a 15-minute timeout in line with this.

47d1763

alfraser committed on Mar 4, 2024

Updated one type hint

8dc34d0

alfraser committed on Mar 4, 2024

Reviewed comments and type hints

d50e68d

alfraser committed on Mar 4, 2024

Reviewed comments and type hints

eb8e0a0

alfraser committed on Mar 4, 2024

Reviewed comments and type hints

8f8b146

alfraser committed on Mar 4, 2024

Added one type hint

63018b5

alfraser committed on Mar 4, 2024

Reviewed for comments and type hints

564477a

alfraser committed on Mar 4, 2024

Checked type hints

2cb7b84

alfraser committed on Mar 4, 2024

Tidied up generate_data.py

a1317da

alfraser committed on Mar 4, 2024

Made updates to support automatic reload of the TestGroups after a test run

e35ef72

alfraser committed on Feb 6, 2024

Updated from using random.choices to random.sample throughout where I need a random distinct set, as choices samples with replacement, so you can get the same item twice. Discovered in pricing testing.

b897a48

alfraser committed on Feb 5, 2024
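
The difference behind this commit can be seen in a small sketch. The product list is made up for illustration; only the two standard-library calls come from the commit message.

```python
import random

# Hypothetical population; the real code draws from its own data.
products = ["laptop", "phone", "tablet", "monitor", "keyboard"]

# random.choices samples WITH replacement, so the same item can appear twice.
with_replacement = random.choices(products, k=3)

# random.sample samples WITHOUT replacement, so every picked item is distinct.
distinct = random.sample(products, k=3)

print(with_replacement)
print(distinct)
```

One consequence worth noting: `random.sample` raises `ValueError` when `k` is larger than the population, whereas `random.choices` accepts any `k`.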

Marked the logger as a daemon thread so it doesn't prevent the exit of the Python interpreter

d5cf91c

alfraser committed on Feb 5, 2024

Fixed bug where the Logger was logging its own name and not that of the architecture.

30696ca

alfraser committed on Feb 5, 2024

Implemented a single-threaded worker for writing the logs to the JSON file, giving controlled access to that file system resource now that we are multi-threading the tests.

c0a1e47

alfraser committed on Feb 5, 2024
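
One common way to get this kind of controlled file access is a queue drained by a single writer thread. The sketch below is an assumption about the general pattern, not the repository's implementation: the class name `JsonLogWriter`, the one-JSON-object-per-line file layout, and the `flush` helper are all hypothetical.

```python
import json
import queue
import threading
from pathlib import Path


class JsonLogWriter:
    """Hypothetical writer: many threads enqueue, one thread touches the file."""

    def __init__(self, path: str) -> None:
        self._path = Path(path)
        self._queue: queue.Queue = queue.Queue()
        # daemon=True so the worker never blocks interpreter exit.
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def log(self, record: dict) -> None:
        """Safe to call from any thread; only enqueues the record."""
        self._queue.put(record)

    def flush(self) -> None:
        """Block until every queued record has been written."""
        self._queue.join()

    def _run(self) -> None:
        while True:
            record = self._queue.get()
            try:
                # Assumed layout: one JSON object appended per line.
                with self._path.open("a", encoding="utf-8") as f:
                    f.write(json.dumps(record) + "\n")
            finally:
                self._queue.task_done()
```

Because a daemon thread is stopped abruptly at interpreter exit, a caller that needs every record on disk would call `flush()` before exiting.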

Modified test runner to dispatch requests in parallel to make use of the fact that there is a lot of wait time for the LLM. Defaulting to 16 threads.

bb7db2c

alfraser committed on Feb 1, 2024
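
A sketch of how I/O-bound requests can be dispatched in parallel, assuming a standard-library thread pool. The 16-thread default is from the commit message; `ask_llm`, `run_tests`, and the simulated delay are hypothetical stand-ins for the real test runner and endpoint call.

```python
import time
from concurrent.futures import ThreadPoolExecutor

DEFAULT_THREADS = 16  # default stated in the commit message


def ask_llm(question: str) -> str:
    """Hypothetical stand-in for a slow, mostly-waiting LLM request."""
    time.sleep(0.05)  # simulated network and inference latency
    return f"answer to: {question}"


def run_tests(questions: list[str], max_workers: int = DEFAULT_THREADS) -> list[str]:
    """Send all questions concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(ask_llm, questions))


if __name__ == "__main__":
    answers = run_tests([f"question {i}" for i in range(32)])
    print(len(answers), "responses received")
```

Threads suit this workload because each request spends most of its time waiting on the endpoint rather than using the CPU.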

Added runner for pricing fact checks to assess the level of fact embedding in the latest model

c319c31

alfraser committed on Feb 1, 2024

Saved file records to DB. Fixed a print to show the correct test-group name.

3a9dec1

alfraser committed on Jan 31, 2024

Updated the testing page to show the request/response pairs

9cec719

alfraser committed on Jan 30, 2024

Updated the offline save to save the actual request and response text

34061f5

alfraser committed on Jan 30, 2024

Fixed display of architecture name

d6b7bf0

alfraser committed on Jan 30, 2024

Logged start of architecture invocation so if one stalls you can see what it is in the logs

bdc40cf

alfraser committed on Jan 30, 2024

Update to the new URL for model v5

cce95a5

alfraser committed on Jan 29, 2024

Trying fine-tuning yet another way - all run, now testing v3 of the model

a48e190

alfraser committed on Jan 29, 2024

Added raw prompt format to just do passthrough so I can test a number of different examples just typed in

92aa543

alfraser committed on Jan 26, 2024

Added another test prompt format

20cff9b

alfraser committed on Jan 26, 2024

Tweaked the test generator and updated the tests

ca7e5c7

alfraser committed on Jan 26, 2024

Added the test question generator and increased the size of the question bank to 500

59b2aff

alfraser committed on Jan 26, 2024

Added a missing comment

bcc302b

alfraser committed on Jan 26, 2024

Fixed a bug where, if the architecture had failed entirely and not generated a response, the whole load of TestGroups would crash. The root cause of the failure to generate a response still needs fixing, but it should be caught gracefully here in any event.

4332953

alfraser committed on Jan 26, 2024

Refactored loading the TestGroups to make the structure of the json load and the DB load the same and clearer

c76e6f5

alfraser committed on Jan 26, 2024

Added comments throughout

e912278

alfraser committed on Jan 26, 2024

Switched endpoint control to use the writeable token as it was inconsistent with the normal token.

2122072

alfraser committed on Jan 25, 2024

Fixed bug with default prompt style not being valid

190ec66

alfraser committed on Jan 25, 2024

Configured more architectures, each with a different prompt style, to try to debug the fine-tuning issue

53169ab

alfraser committed on Jan 25, 2024

Trying a different prompting style

3991f6c

alfraser committed on Jan 25, 2024

Tweaked the training data format to try and fix the issue of the model repeating the question over and over

abcd8a9

alfraser committed on Jan 25, 2024

Added loading of test groups from both the DB and the local file and merging these two

1fb12dc

alfraser committed on Jan 25, 2024

Added the SQLite DB where I will archive the test results, and added the archiving code

843d9d3

alfraser committed on Jan 25, 2024
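
As a rough sketch of what archiving test results to SQLite can look like with the standard `sqlite3` module. The table name, column names, and the `archive_results` helper are assumptions made for illustration; the commit does not show the repository's actual schema.

```python
import sqlite3


def archive_results(conn: sqlite3.Connection, results: list[tuple[str, str, str]]) -> None:
    """Append (test_group, question, response) rows to a hypothetical table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS test_results "
        "(test_group TEXT, question TEXT, response TEXT)"
    )
    # Parameterised inserts avoid quoting problems in free-text responses.
    conn.executemany("INSERT INTO test_results VALUES (?, ?, ?)", results)
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # a real archive would use a file path
    archive_results(conn, [("group-1", "What does it cost?", "It costs ...")])
    print(conn.execute("SELECT COUNT(*) FROM test_results").fetchone()[0])
```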

Added option to directly pass the HF hub token when wiping the trace file, so I can use it locally outside of streamlit. Defaulted to None to avoid changing existing behaviour.

3853f7c

alfraser committed on Jan 25, 2024

Fixed bugs from the refactor of repo access. Now it should save trace again.

f10615b

alfraser committed on Jan 24, 2024

Added the test reporting structure

82130cb

alfraser committed on Jan 24, 2024

Refactored to bring common variables together. Also added a utility to get all the trace records as a list of records

8f424fc

alfraser committed on Jan 24, 2024

Fixed the time.time bug here. Also a call to reset the Chroma DB

1cb115b

alfraser committed on Jan 24, 2024

Tweaked the way the prompt is formatted going into the LLM query, to avoid the fine-tuned model giving nonsense answers

2022fec

alfraser committed on Jan 24, 2024

Flipped the default dataset to be the baseline not the "All products"

a05b15e

alfraser committed on Jan 24, 2024
