petuumkx committed on
Commit
6889278
·
1 Parent(s): a0af43d

Update README: (1) add table of contents; (2) change Topic to Subject; (3) add emoji to sections; (4) add roadmap section.

  1. README.md +36 -14
README.md CHANGED
@@ -1,20 +1,33 @@
- # Personalized-Arxiv-digest
- This repo aims to provide a better daily digest for newly published arxiv papers based on your own research interests and descriptions.
 
- ## What this repo does
 
- Staying up to date on [arxiv](https://arxiv.org) papers can take a considerable amount of time, with on the order of hundreds of new papers each day to filter through. There is an [official daily digest service](https://info.arxiv.org/help/subscribe.html), however large subtopics like [cs.AI](https://arxiv.org/list/cs.AI/recent) still have 50-100 papers a day. Determining if these papers are relevant and important to you means reading through the title and abstract.
 
- This repository provides a way to have this daily digest sorted by relevance via large language models:
 
- * You modify the configuration file `config.yaml` with an arxiv topic, some set of subtopics, and a natural language statement about the type of papers you are interested in
- * The code pulls all the abstracts for papers in those subtopics and ranks how relevant they are to your interest on a scale of 1-10 using gpt-3.5-turbo.
- * The code then emits an HTML digest listing all the relevant papers, and optionally emails it to you using [SendGrid](https://sendgrid.com). You will need to have a SendGrid account with an API key for this functionality to work
 
 
  ### Some examples:
 
- - Topic: Computer Science
  - Categories: Artificial Intelligence, Computation and Language
  - Interest:
  - Large language model pretraining and finetunings
@@ -25,14 +38,14 @@ This repository provides a way to have this daily digest sorted by relevance via
  ![example1](./readme_images/example_1.png)
 
 
- - Topic: Quantitative Finance
  - Interest: "making lots of money"
 
  ![example2](./readme_images/example_2.png)
 
- ## Usage
 
- ### Running as a github action using SendGrid.
 
  The recommended way to get started using this repository is to:
 
@@ -67,7 +80,7 @@ An alternative way to get started using this repository is to:
  - `TO_EMAIL` (only if you don't have it set in `config.yaml`)
  6. Manually trigger the action or wait until the scheduled action takes place.
 
- #### Running as a github action without emails
 
  If you do not wish to create a SendGrid account or use your email authentication, the action will also emit an artifact containing the HTML output. Simply do not create the SendGrid or SMTP secrets.
 
@@ -99,7 +112,16 @@ Install the requirements in `src/requirements.txt` as well as `gradio`. Set the
 
  Run `python src/app.py` and go to the local URL. From there you will be able to preview the papers from today, as well as the generated digests.
 
- ## Extending and Contributing
 
  You may (and are encouraged to) modify the code in this repository to suit your personal needs. If you think your modifications would be in any way useful to others, please submit a pull request.
 
+ # Personalized-arXiv-digest
+ This repo aims to provide a better daily digest for newly published arXiv papers based on your own research interests and descriptions.
 
+ ## 📚 Contents
 
+ - [What this repo does](#what-this-repo-does)
+ * [Examples](#some-examples)
+ - [Usage](#usage)
+ * [Running as a github action using SendGrid (Recommended)](#running-as-a-github-action-using-sendgrid-recommended)
+ * [Running as a github action with SMTP credentials](#running-as-a-github-action-with-smtp-credentials)
+ * [Running as a github action without emails](#running-as-a-github-action-without-emails)
+ * [Running from the command line](#running-from-the-command-line)
+ * [Running with a user interface](#running-with-a-user-interface)
+ - [Roadmap](#roadmap)
+ - [Extending and Contributing](#extending-and-contributing)
 
+ ## 🔍 What this repo does
 
+ Staying up to date on [arXiv](https://arxiv.org) papers can take a considerable amount of time, with hundreds of new papers to filter through each day. There is an [official daily digest service](https://info.arxiv.org/help/subscribe.html), but large categories like [cs.AI](https://arxiv.org/list/cs.AI/recent) still have 50-100 papers a day. Determining whether these papers are relevant and important to you means reading through each title and abstract, which is time-consuming.
+
+ This repository offers a way to curate a daily digest, sorted by relevance, using large language models. The models are conditioned on your personal research interests, described in natural language.
+
+ * You modify the configuration file `config.yaml` with an arXiv Subject, a set of Categories, and a natural language statement about the type of papers you are interested in.
+ * The code pulls all the abstracts for papers in those categories and ranks how relevant they are to your interest on a scale of 1-10 using `gpt-3.5-turbo`.
+ * The code then emits an HTML digest listing all the relevant papers, and optionally emails it to you using [SendGrid](https://sendgrid.com). You will need a SendGrid account with an API key for this functionality to work.
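
The last two steps above (keep the relevant papers, then emit an HTML digest) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the repository's implementation: the `Paper` fields, the `threshold` cutoff, and both function names are hypothetical, and the 1-10 relevance scores are assumed to have already been returned by the language model.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Paper:
    title: str
    url: str
    score: int  # relevance on the 1-10 scale, assumed to come from the LLM


def select_relevant(papers, threshold=7):
    """Keep papers scoring at or above the (hypothetical) threshold, best first."""
    kept = [p for p in papers if p.score >= threshold]
    return sorted(kept, key=lambda p: p.score, reverse=True)


def render_digest(papers):
    """Render the selected papers as a small HTML list, escaping user-facing text."""
    items = "\n".join(
        f'<li><a href="{escape(p.url, quote=True)}">{escape(p.title)}</a> '
        f"(relevance {p.score}/10)</li>"
        for p in papers
    )
    return f"<ul>\n{items}\n</ul>"


if __name__ == "__main__":
    # Placeholder papers for illustration only.
    demo = [
        Paper("Paper A", "https://example.org/a", 5),
        Paper("Paper B", "https://example.org/b", 9),
    ]
    print(render_digest(select_relevant(demo)))
```

Filtering before rendering keeps the digest short; whether the real code filters by a fixed cutoff or only sorts is not stated here, so treat the threshold as a placeholder.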
 
 
  ### Some examples:
 
+ - Subject: Computer Science
  - Categories: Artificial Intelligence, Computation and Language
  - Interest:
  - Large language model pretraining and finetunings
 
  ![example1](./readme_images/example_1.png)
 
 
+ - Subject: Quantitative Finance
  - Interest: "making lots of money"
 
  ![example2](./readme_images/example_2.png)
 
+ ## 💡 Usage
 
+ ### Running as a github action using SendGrid (Recommended).
 
  The recommended way to get started using this repository is to:
 
 
  - `TO_EMAIL` (only if you don't have it set in `config.yaml`)
  6. Manually trigger the action or wait until the scheduled action takes place.
 
+ ### Running as a github action without emails
 
  If you do not wish to create a SendGrid account or use your email authentication, the action will also emit an artifact containing the HTML output. Simply do not create the SendGrid or SMTP secrets.
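
As a rough sketch of this fallback, the choice between emailing the digest and only writing the HTML artifact can be driven by which credentials are present. The names `SENDGRID_API_KEY` and `SMTP_PASSWORD` below are placeholders assumed for illustration, as is the `choose_delivery` helper; use the secret names given in the setup steps.

```python
import os


def choose_delivery(env=None):
    """Pick a delivery mode from the credentials that are set.

    The variable names are hypothetical placeholders, not the
    repository's documented secret names.
    """
    env = os.environ if env is None else env
    if env.get("SENDGRID_API_KEY"):
        return "sendgrid"
    if env.get("SMTP_PASSWORD"):
        return "smtp"
    # No email credentials: only the HTML artifact is produced.
    return "artifact"
```

Treating an empty value the same as a missing one matches the way an unset secret typically reaches a workflow as an empty string.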
 
 
 
  Run `python src/app.py` and go to the local URL. From there you will be able to preview the papers from today, as well as the generated digests.
 
+ ## ✅ Roadmap
+
+ - [x] Support personalized paper recommendation using LLM.
+ - [x] Send emails for daily digest.
+ - [ ] Implement a ranking factor to prioritize content from specific authors.
+ - [ ] Support open-source models, e.g., LLaMA, Vicuna, MPT.
+ - [ ] Fine-tune an open-source model to better support paper ranking and stay updated with the latest research concepts.
+
+
+ ## 💁 Extending and Contributing
 
  You may (and are encouraged to) modify the code in this repository to suit your personal needs. If you think your modifications would be in any way useful to others, please submit a pull request.