Tijmen2 committed on
Commit
cf5d663
1 Parent(s): ff243da

Update README.md

Files changed (1)
  1. README.md +122 -0
README.md CHANGED
@@ -1,3 +1,125 @@
1
  ---
2
  license: mit
3
  ---
1
  ---
2
  license: mit
3
+ language:
4
+ - en
5
+ pipeline_tag: question-answering
6
  ---
7
+ # cosmosage
8
+
9
+ Cosmosage is a natural-language cosmology assistant that can answer questions about cosmology.
10
+
11
+ cosmosage_v1 is a fine-tune of fblgit/UNA-TheBeagle-7b-v1 on various cosmology-related datasets, including
12
+ open-access textbooks and scientific publications. It is primarily intended for Q&A mode, where the model
13
+ gives a single answer in response to a single question, though it also performs decently as a chatbot.
14
+ See https://github.com/tijmen/cosmosage for more details.
15
+
16
+ ## Usage
17
+
18
+ One convenient way of using this model is with `text-generation-webui`:
19
+
20
+ https://github.com/oobabooga/text-generation-webui
21
+
22
+ - Under the Model tab, download a custom model by entering "Tijmen2/cosmosage_v1_gptq".
23
+ - When the download is complete, click the refresh icon next to Model.
24
+ - Select AutoGPTQ as your model loader.
25
+ - In the Model dropdown, choose the cosmosage model you just downloaded and load it.
26
+ - Head over to the Notebook tab and ask away!
27
+
28
+ ## Prompt Style
29
+
30
+ The model was trained with data formatted like:
31
+ ```
32
+ {system, optional}
33
+ USER: {question}
34
+ ASSISTANT: {answer}
35
+ ```
36
+
37
+ Feel free to experiment with different prompts and sampling parameters. No system prompt, i.e.
38
+ ```
39
+ USER: <<<your question here>>>
40
+ ASSISTANT:
41
+ ```
42
+ works well, but usually leads to very short answers to questions. If you want longer responses, you should
43
+ ask for them in the system prompt.
44
+
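The template above can be assembled with a small helper. The following is a minimal sketch, not part of the cosmosage repository: the function name `build_prompt` is hypothetical, and the only thing taken from this README is the `USER:` / `ASSISTANT:` layout with an optional leading system line.

```python
from typing import Optional


def build_prompt(question: str, system: Optional[str] = None) -> str:
    """Format one question in the USER/ASSISTANT layout described above.

    `build_prompt` is an illustrative helper, not an API of the model or repo.
    """
    lines = []
    if system:
        # The system text is optional and comes before the user turn.
        lines.append(system.strip())
    lines.append(f"USER: {question.strip()}")
    # Leave the assistant turn open so the model completes the answer.
    lines.append("ASSISTANT:")
    return "\n".join(lines)


if __name__ == "__main__":
    # No system prompt: tends to give short answers, per the text above.
    print(build_prompt("What is the ISW effect?"))
    print()
    # Asking for longer responses via the system line.
    print(build_prompt(
        "What is the ISW effect?",
        system="Elaborate where possible to give a complete answer.",
    ))
```

The resulting string can be pasted into the Notebook tab or passed to whichever loader you use.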
45
+ The following is an example prompt that leverages the model's instruction-following capabilities as
46
+ well as two-shot prompting. I've been using it with temperature=0.4, min_p=0.1.
47
+
48
+ ```
49
+ You are an expert cosmology assistant, able to answer questions on the cosmic microwave background,
50
+ galaxy formation, large scale structure, theoretical cosmology, inflation, big bang nucleosynthesis, cosmology
51
+ instrumentation, and other related topics. Please assume the user is fluent in scientific terminology. Elaborate
52
+ where possible to give a complete answer. If you do not know, say you do not know. You will get two examples,
53
+ then be expected to answer one question to the best of your ability.
54
+
55
+
56
+ ## EXAMPLE 1
57
+ USER: What is the ISW effect?
58
+ ASSISTANT: The Integrated Sachs-Wolfe effect describes the fact that cosmic microwave background photons gain
59
+ energy as they fall into potential wells and lose it again as they climb back out. The origin of the effect is
60
+ gravitational redshift. For static potential wells, there is no net effect. However, if the potentials grow or
61
+ decay, there is a net effect, known as the Integrated Sachs-Wolfe effect. In a matter-dominated universe, linear
62
+ potentials stay constant. The late-time ISW effect refers to the fact that at late times, dark energy causes a
63
+ deviation from matter domination and linear potentials decay. This causes an effect in the cosmic microwave
64
+ background which has been measured in cross-correlation with large-scale structure. The early-time ISW refers
65
+ to the fact that right after recombination, radiation still contributed a non-negligible amount of the energy
66
+ density of the universe, causing potentials to grow. This causes the photons to gain energy. The early-time ISW
67
+ has not been detected experimentally.
68
+
69
+ ## EXAMPLE 2
70
+ USER: What are the definitions of galaxy cluster mass (M200m, M500c, etc.)?
71
+ ASSISTANT: Calculating a galaxy cluster's mass is like calculating the population of a sprawling city; as you
72
+ get further out from city center, the suburbs never end. Eventually you have to stop counting and just say that
73
+ from there on the people belong to a different town.
74
+ In cluster masses, we integrate from the cluster center out to some radius rΔ where Δ might be 200, 500, or
75
+ some other number. The definition of this radius is that the density of the enclosed mass is Δ times the
76
+ average density of the universe at that redshift. The associated mass is MΔ.
77
+ Note that 'average density' can be defined in two ways. You can either include the energy density of dark
78
+ energy - in which case you are comparing to the critical density at that redshift - or you can use the matter
79
+ density only. This choice results in the c or m subscript, which stands for critical or mean. Note that the
80
+ critical density is higher than the mean density, so it gives an earlier cutoff in the integral. Therefore
81
+ 'MΔc' masses are smaller than 'MΔm' ones.
82
+
83
+ ## QUESTION
84
+ USER: <<<INSERT YOUR QUESTION>>>
85
+ ASSISTANT:
86
+ ```
87
+
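The two-shot prompt above follows a regular pattern: a system paragraph, numbered example blocks, then the open question. A minimal sketch of assembling it programmatically is below. The names `build_few_shot_prompt` and `SAMPLING` are hypothetical; the temperature and min_p values are the ones quoted above, and how they are passed to a backend depends on the loader you use.

```python
from typing import Sequence, Tuple

# Sampling settings quoted in this README; the dict itself is illustrative
# and is not tied to any particular backend's parameter names.
SAMPLING = {"temperature": 0.4, "min_p": 0.1}


def build_few_shot_prompt(
    system: str,
    examples: Sequence[Tuple[str, str]],
    question: str,
) -> str:
    """Assemble a system paragraph, worked examples, and the final question.

    Each example is a (question, answer) pair. The layout mirrors the
    '## EXAMPLE n' / '## QUESTION' headings used in the prompt above.
    """
    # System text, followed by two blank lines as in the prompt above.
    parts = [system.strip(), "", ""]
    for i, (q, a) in enumerate(examples, start=1):
        parts.append(f"## EXAMPLE {i}")
        parts.append(f"USER: {q}")
        parts.append(f"ASSISTANT: {a}")
        parts.append("")  # blank line between blocks
    parts.append("## QUESTION")
    parts.append(f"USER: {question}")
    # Leave the assistant turn open for the model to complete.
    parts.append("ASSISTANT:")
    return "\n".join(parts)


if __name__ == "__main__":
    prompt = build_few_shot_prompt(
        "You are an expert cosmology assistant.",
        [("What is the ISW effect?", "(example answer 1)"),
         ("What are the definitions of galaxy cluster mass?", "(example answer 2)")],
        "Why does a cross-spectrum not suffer from noise bias?",
    )
    print(prompt)
    print(SAMPLING)
```

Keeping the examples as data makes it easy to swap in different shots or to vary how many are used.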
88
+ ## Example outputs
89
+
90
+ ```
91
+ USER: What is the current best measure of the optical depth to reionization?
92
+ ASSISTANT: The current best measure of the optical depth to reionization is τ=0.054±0.007, as determined by the
93
+ Planck Collaboration et al. (2018b). This value provides strong evidence for reionization at high redshift and
94
+ helps constrain models of cosmic reionization.
95
+
96
+ USER: In DfMux, the SQUID is nulled via Digital Active Nulling at MHz frequencies. How is the SQUID kept nulled
97
+ at low frequency?
98
+ ASSISTANT: In DfMux, the SQUID is kept nulled at low frequency by using a low-frequency flux-locked loop (FLL).
99
+ The FLL is a feedback system that maintains the SQUID in its nulled state by adjusting the input current to
100
+ cancel out the flux signal. This is necessary because the SQUID is sensitive to magnetic fields, and the
101
+ magnetic field generated by the input current can change the bias flux, causing the SQUID to lose its null
102
+ state. The FLL helps to maintain the null state by continuously adjusting the input current to counteract any
103
+ changes in the bias flux.
104
+
105
+ USER: Why does a cross-spectrum not suffer from noise bias?
106
+ ASSISTANT: A cross-spectrum between two different tracers of the same underlying field does not suffer from
107
+ noise bias because the noise in each tracer is uncorrelated with the noise in the other tracer. This is because
108
+ the noise in each tracer is a random process that is independent of the other tracer. Therefore, the noise in
109
+ one tracer does not affect the expected value of the cross-spectrum between the two tracers. This is in
110
+ contrast to an auto-spectrum, which does suffer from noise bias because the noise in one tracer is correlated
111
+ with the noise in the same tracer.
112
+
113
+ USER: Summarize "Impact of beam far side-lobe knowledge in the presence of foregrounds for LiteBIRD".
114
+ ASSISTANT: This paper investigates the impact of beam far side-lobes on the estimation of the tensor-to-scalar
115
+ ratio in the presence of foregrounds for the LiteBIRD mission. The authors find that the presence of foregrounds
116
+ can significantly affect the estimation of the tensor-to-scalar ratio, and that the knowledge of the beam far
117
+ side-lobes is crucial for accurate results. The study also discusses the importance of accurately modeling the
118
+ beam profile and the challenges in estimating the tensor-to-scalar ratio in the presence of foregrounds.
119
+ ```
120
+
121
+ ## Qualitative evaluation
122
+
123
+ cosmosage_v1 is by far the best cosmosage model yet. However, like all LLMs, it continues to struggle with
124
+ _reliability_. While many of its answers are factually accurate, some are not. The outputs of cosmosage
125
+ (or any LLM) should not be trusted to be factual.