For a moment Brahe searched for words, images, analogies; he even thought of the gestures of the hand and fingers, like an actor preparing to give physical form to a feeling. But as soon as he began to say "like", to give solidity to what had none, to make visible what was not, to place in space what was pure probability, and to look for anything among the forms of the world to compare it to, Epstein interrupted him.
Daniele del Giudice, Atlante occidentale

Brahe-AWQ is a lightweight quantized version of Brahe, an analytical LLM for multilingual literature fine-tuned from llama-13B. Given any text, Brahe generates a list of up to about twenty annotations. Brahe is intended for use in computational humanities projects, similarly to BookNLP.

Brahe has been trained on 8,000 excerpts of literature in the public domain and on a set of synthetic and manual annotations. Half of the excerpts are in English and half in other languages (mostly French, German, Italian…).

Thanks to the native multilingual capacity of llama-13B, Brahe-AWQ has been shown to work on languages that were not part of its original corpus, such as the Gascon variety of Occitan.

Brahe is the reverse companion of Epstein, a generative AI model that creates new literary texts from annotated prompts (English only for now). Both models are named after the protagonists of Daniele del Giudice's philosophical novel, Atlante occidentale. Brahe is a scientist working on quantum physics at CERN, Epstein is a novelist, and the two confront their different views of reality.

Running Brahe

The best way to test Brahe-AWQ is to use the official demo on Google Colab.

In contrast with Epstein, it is recommended to use deterministic text generation (temperature = 0). Otherwise the annotations may not be structured as expected.

Prompts are currently constructed in this way:

"Text:\n" + text + "\n\n\nAnalysis: \n"
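The prompt layout above can be sketched in Python as follows. This is only an illustration: the function name `build_prompt` and the `GENERATION_SETTINGS` dictionary are hypothetical and not part of any Brahe release, and the `max_tokens` value is an arbitrary assumption. The resulting string would be passed to whichever inference library is used to load the AWQ weights.

```python
def build_prompt(text: str) -> str:
    """Wrap a passage in the prompt layout quoted above."""
    return "Text:\n" + text + "\n\n\nAnalysis: \n"


# Hypothetical settings (names are assumptions, not a library API).
# Temperature 0 corresponds to the deterministic generation recommended above.
GENERATION_SETTINGS = {"temperature": 0.0, "max_tokens": 512}

prompt = build_prompt(
    "Strether's first question, when he reached the hotel, was about his friend."
)
print(prompt)
```

The passage is inserted verbatim, without stripping or truncation, so that the string matches the layout given above.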

Annotations

In its current version, Brahe may generate the following annotations.

  • Summary: short summary
  • Tone: general tonality of the text (humorous, tragic, scholarly…)
  • Speech standard: the specific social/literary level of the text (poetic, dialectal, vulgar…)
  • Intertextuality: non-literary writing forms that may be similar to this text (red tape, scientific article, case law…)
  • Genre: a specific literary genre that would be used in bookshops such as detective fiction, science-fiction, romance, historical novel, young adult…
  • Literary movement: aesthetic movement the text seems to embody (does not work so well)
  • Literary form: whether it's the description of a place, a conversation, a stream of consciousness
  • Trope: a trope or literary cliché (a fuzzy definition but works surprisingly well)
  • Enunciation: who is speaking in the text (first-person narrative, dialog, third-person narrative, omniscient narrator)
  • Narrative arc: how the action unfolds (suspense, dramatic tension, comic relief…)
  • Active character: the list of characters that have an active involvement in the story.
  • Mentioned characters: the list of characters only mentioned, with no active involvement in the story
  • Quoted works: other texts mentioned or quoted in the text.
  • Absolute place: a precise place with a proper name such as Paris, Sesame Street, Lisbon Airport.
  • Fuzzy place: unnamed place where the story happens such as a field, an apartment, a church (does not work so well…)
  • Fuzzy time: nonspecific moment when the action occurs, such as Monday, yesterday, a week after.
  • Time setting: historical period in which the action seems to occur such as the 1960s, the Renaissance, the Victorian period…
  • Diegetic time: very approximate number of minutes/hours/days that have unfolded between the beginning and the end of the text (5 minutes, 35 minutes, 2 hours, 3 days).
  • Absolute time: a precise date on which the action occurs, such as January 15, 1845, 23rd century…

The annotations are not generated systematically but only whenever the model is confident enough.
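Because any annotation may be missing, downstream code should not assume a fixed set of fields. The following is a minimal sketch of one way to read the output, assuming each annotation sits on its own line in the form `Label: value`. The function name `parse_annotations` is hypothetical, and the line format is inferred from the layout of the model's output rather than from a documented specification.

```python
def parse_annotations(output: str) -> dict[str, str]:
    """Map each annotation label to its value.

    Assumes one "Label: value" pair per line; lines without a colon,
    or with an empty label or value, are skipped.
    """
    annotations: dict[str, str] = {}
    for line in output.splitlines():
        # Split on the first colon only, so values may contain colons.
        label, sep, value = line.partition(":")
        if not sep or not label.strip() or not value.strip():
            continue
        # Keep the first occurrence if a label is repeated.
        annotations.setdefault(label.strip(), value.strip())
    return annotations


sample = (
    "Summary: A conversation between two characters.\n"
    "Enunciation: Dialog\n"
    "Tone: Serious, argumentative\n"
)
parsed = parse_annotations(sample)
print(parsed.get("Tone"))
print(parsed.get("Genre", "not generated"))
```

Using `dict.get` with a default keeps the calling code working when the model leaves an annotation out. A value that wraps onto a second line containing a colon would be misread as a new label, so this sketch is only suitable for single-line annotations.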

Examples

With the incipit of Henry James' The Ambassadors:

Strether's first question, when he reached the hotel, was about his friend; yet on his learning that Waymarsh was apparently not to arrive till evening he was not wholly disconcerted. A telegram from him bespeaking a room "only if not noisy," with the answer paid, was produced for the inquirer at the office, so that the understanding that they should meet at Chester rather than at Liverpool remained to that extent sound. The same secret principle, however, that had prompted Strether not absolutely to desire Waymarsh's presence at the dock, that had led him thus to postpone for a few hours his enjoyment of it, now operated to make him feel that he could still wait without disappointment. They would dine together at the worst, and, with all respect to dear old Waymarsh—if not even, for that matter, to himself—there was little fear that in the sequel they should not see enough of each other. The principle I have just mentioned as operating had been, with the most newly-disembarked of the two men, wholly instinctive—the fruit of a sharp sense that, delightful as it would be to find himself looking, after so much separation, into his comrade's face, his business would be a trifle bungled should he simply arrange that this countenance should present itself to the nearing steamer as the first "note," for him, of Europe. Mixed with everything was the apprehension, already, on Strether's part, that he would, at best, throughout, prove the note of Europe in quite a sufficient degree.

Output of Brahe:

Summary: Strether is waiting for his friend Waymarsh to arrive at the hotel, but is not completely disappointed when he learns that Waymarsh will not arrive until evening. Strether has a secret principle that led him to postpone meeting Waymarsh at the dock, but now feels that they will see enough of each other. Strether is concerned that his presence in Europe will be enough of a note for his friend.
Trope: The anticipation of meeting a friend
Narrative arc: Suspense
Enunciation: Third-person narrative
Tone: Anticipatory
Genre: Literary fiction
Intertextuality: Travelogue
Speech standard: Standard
Literary form: Stream of consciousness
Literary movement: Realism
Active character: Strether, Waymarsh
Time setting: Contemporary
Fuzzy time: Evening
Diegetic time: A few hours
Absolute place: The hotel
Fuzzy place: The hotel

With the incipit of Proust's Swann's Way:

Combray at a distance, from a twenty-mile radius, as we used to see it from the railway when we arrived there every year in Holy Week, was no more than a church epitomising the town, representing it, speaking of it and for it to the horizon, and as one drew near, gathering close about its long, dark cloak, sheltering from the wind, on the open plain, as a shepherd gathers his sheep, the woolly grey backs of its flocking houses, which a fragment of its mediaeval ramparts enclosed, here and there, in an outline as scrupulously circular as that of a little town in a primitive painting. To live in, Combray was a trifle depressing, like its streets, whose houses, built of the blackened stone of the country, fronted with outside steps, capped with gables which projected long shadows downwards, were so dark that one had, as soon as the sun began to go down, to draw back the curtains in the sitting-room windows; streets with the solemn names of Saints, not a few of whom figured in the history of the early lords of Combray, such as the Rue Saint-Hilaire, the Rue Saint-Jacques, in which my aunt's house stood, the Rue Sainte-Hildegarde, which ran past her railings, and the Rue du Saint-Esprit, on to which the little garden gate opened; and these Combray streets exist in so remote a quarter of my memory, painted in colours so different from those in which the world is decked for me to-day, that in fact one and all of them, and the church which towered above them in the Square, seem to me now more unsubstantial than the projections of my magic-lantern; while at times I feel that to be able to cross the Rue Saint-Hilaire again, to engage a room in the Rue de l'Oiseau, in the old hostelry of the Oiseau Flesché, from whose windows in the pavement used to rise a smell of cooking which rises still in my mind, now and then, in the same warm gusts of comfort, would be to secure a contact with the unseen world more marvellously supernatural than it would be to make Golo's acquaintance 
and to chat with Geneviève de Brabant.

Output of Brahe:

Summary: The narrator reflects on their childhood memories of Combray, a town with dark and depressing streets.
Trope: The town as a character
Narrative arc: Reflective
Enunciation: First-person narrative
Tone: Nostalgic
Genre: Memoir
Intertextuality: Historical accounts of Combray
Speech standard: Poetic
Literary form: Description of a place
Literary movement: Symbolism
Active character: The narrator, the aunt
Time setting: Childhood
Absolute place: Combray
Fuzzy place: Combray

With the incipit of War and Peace, in both French and Russian (Russian is barely represented in Brahe's training corpus):

— Eh bien, mon prince. Gênes et Lucques ne sont plus que des apanages, des поместья, de la famille Buonaparte. Non, je vous préviens que si vous ne me dites pas que nous avons la guerre, si vous vous permettez encore de pallier toutes les infamies, toutes les atrocités de cet Antichrist (ma parole, j’y crois) — je ne vous connais plus, vous n’êtes plus mon ami, vous n’êtes plus мой верный раб, comme vous dites. Ну, здравствуйте, здравствуйте. Je vois que je vous fais peur, садитесь и рассказывайте.
Так говорила в июле 1805 года известная Анна Павловна Шерер, фрейлина и приближенная императрицы Марии Феодоровны, встречая важного и чиновного князя Василия, первого приехавшего на ее вечер. Анна Павловна кашляла несколько дней, у нее был грипп, как она говорила (грипп был тогда новое слово, употреблявшееся только редкими). В записочках, разосланных утром с красным лакеем, было написано без различия во всех:

Output of Brahe:

Summary: A conversation between two characters, Anna Pavlovna and Vasily, about the Bonaparte family and war.
Enunciation: Dialog
Tone: Serious, argumentative
Genre: Historical fiction
Intertextuality: Political speeches or documents
Speech standard: Formal, elevated language
Literary form: Conversation/dialogue

