Examples
Summarize an EPUB
Here is an example of how you might make a summary of an EPUB book. It works by:
- splitting each chapter into overlapping blocks of text
- summarizing each block using a template
- squashing the summaries into a single summary
- using another prompt to clean up the summary
Here's the full program in a Prompterfile:
load test.epub
select "block_tag like 'chapter%'"
transform clean-epub html-to-md token-split --n=5000
complete summarize-block.task
squash
complete cleanup-summary.task
retag summary-{{block_tag}}.md
write
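The splitting and squashing steps can be illustrated in plain Python. This is a minimal sketch of the idea, not the tool's implementation: the function names `split_with_overlap` and `squash` are hypothetical, and whitespace-separated words stand in for real model tokens.

```python
def split_with_overlap(tokens, n, overlap):
    """Split tokens into blocks of at most n items.

    Each block repeats the last `overlap` tokens of the previous block.
    """
    if n <= 0 or not 0 <= overlap < n:
        raise ValueError("need n > 0 and 0 <= overlap < n")
    step = n - overlap
    blocks = []
    for start in range(0, len(tokens), step):
        blocks.append(tokens[start:start + n])
        if start + n >= len(tokens):
            break  # the last block already reaches the end of the text
    return blocks


def squash(summaries, separator="\n\n"):
    """Join per-block summaries into one text (assumed behavior of squashing)."""
    return separator.join(summaries)


# Whitespace "tokens" are an assumption made for illustration only.
tokens = "the quick brown fox jumps over the lazy dog again".split()
blocks = split_with_overlap(tokens, n=4, overlap=1)
for block in blocks:
    print(" ".join(block))
```

With `n=4` and `overlap=1`, each block starts on the last word of the previous one, so context at block boundaries appears in two summaries before they are squashed together.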
Here is the first task, which summarizes a single block of text:
summarize-block.task
Summarize the following block of text:
{{content}}
Do this in 250 words or fewer using markdown formatting. Focus on adding bullet lists.
Here is the second task, which cleans up the "squashed" summaries of all the blocks:
cleanup-summary.task
This is a summary of a chapter that was constructed from overlapping chunks of text from a longer work:
{{content}}
Summarize it into 250 words or fewer.
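Both tasks rely on the `{{content}}` placeholder being replaced with the text of the current block. A rough sketch of that substitution is shown here, assuming simple name lookup; `render_task` is a hypothetical helper, and the real tool may use a full template engine instead.

```python
import re


def render_task(template, variables):
    """Replace {{name}} placeholders with values from `variables`.

    Unknown placeholders are left untouched so they are easy to spot.
    """
    def substitute(match):
        name = match.group(1)
        return str(variables.get(name, match.group(0)))

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


template = (
    "Summarize the following block of text:\n"
    "{{content}}\n"
    "Do this in 250 words or fewer."
)
prompt = render_task(template, {"content": "Chapter one text."})
print(prompt)
```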
Create an audiobook from an ebook
To create an audiobook from an ebook:
load example.epub
select "block_tag like '%chapter%'"
transform clean-epub html-to-md token-split --n=5000 --overlap=0
complete convert-to-narrative.md --persona=reader.md
squash
speak
This will load an ebook, transform it into Markdown, split it into blocks, convert the blocks into a narrative, and then use the speak command to generate an audio file.
Here are the prompts and the persona:
convert-to-narrative.md
You are tasked with creating a compelling script that will be spoken aloud, based on the following text:
{{block}}
Replace any markdown formatting elements with spoken words.
If there is a section header, replace the hash marks (however many there are) with an appropriate transitional phrase. For example, "Let's consider...", or "Now, let's talk about...". Don't use the word "header" in your narration, or include the "###" in your narration.
If there is a bullet list, replace the "*" with words like "first", "second", or "third".
If an element is bolded or highlighted in some way, say something like "Here's a really important point."
If there is a link, just read the name of the link aloud.
Here's the persona. Note how specific it is about reading the text as-is. Without that, the model tends to hallucinate.
reader.md
You read technical works aloud to convert them to audio books. You strive to follow the exact text you're reading with as little variation as possible, making changes only when they are absolutely essential. Otherwise, you speak the original text exactly as the author presents it. You might make the occasional exception for things in the text that might not translate to audio, such as a code snippet or a chart or graph.
Make an audio file of GitHub trending repos
This example uses the mshibanami/GitHubTrendingRSS project to convert an RSS feed (in this case, RSS 2.0) of GitHub trending projects into an audio file. Here's how it works:
- load the feed from the "All Languages" feed on GitHub Trending RSS
- transform the feed into JSON using the feed-to-abridged-json command (there are several ways to summarize the feed data)
- complete a prompt that converts the JSON into a Markdown file
- speak to make the audio file
Here's an example of the output of the feed-to-abridged-json command when run on the feed:
[
{
"title": "langgenius/dify",
"link": "https://github.com/langgenius/dify",
"summary": "<p>Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.</p><hr /><p><img alt=\"cover-v5-optimized\" src=\"https://github.com/langgenius/dify/assets/13230914/f9e19af5-61ba-4119-b926-d10c4c06ebab\" /></p> \n<p align=\"center\"> \ud83d\udccc <a href=\"https://dify.ai/blog/introducing-dify-workflow-file-upload-a-demo-on-ai-podcast\">Introducing Dify Workflow File Upload: Recreate Google NotebookLM Podcast</a> </p> \n<p align=\"center\"> <a href=\"https://cloud.dify.ai\">Dify Cloud</a> \u00b7 <a href=\"https://docs.dify.ai/getting-started/install-self-hosted\">Self-hosting</a> \u00b7 <a href=\"https://docs.dify.ai\">Documentation</a> \u00b7 <a href=\"https://udify.app/chat/22L1zSxg6yW1cWQg\">Enterprise inquiry</a> </p> \n<p align=\"center\"> <a href=\"https://dify.ai\" target=\"_blank\"> <img alt=\"Static Badge\" src=\"https://img.shields.io/badge/Product-F04438"
},
...
]
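To make the shape of this output concrete, here is a small sketch that produces similar records from an RSS 2.0 document using only the Python standard library. It is an illustration under assumptions, not the command's actual code: the function name `feed_to_abridged` is hypothetical, and mapping each item's `description` element to the `summary` key is a guess based on the output above.

```python
import json
import xml.etree.ElementTree as ET


def feed_to_abridged(rss_xml):
    """Return a list of {title, link, summary} dicts, one per RSS <item>."""
    root = ET.fromstring(rss_xml)
    entries = []
    for item in root.iter("item"):
        entries.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            # Assumption: the summary comes from the item's <description>.
            "summary": item.findtext("description", default=""),
        })
    return entries


# A tiny hand-written feed used only for this example.
SAMPLE = """<rss version="2.0"><channel><title>Trending</title>
<item>
  <title>langgenius/dify</title>
  <link>https://github.com/langgenius/dify</link>
  <description>&lt;p&gt;Dify is an open-source LLM app development platform.&lt;/p&gt;</description>
</item>
</channel></rss>"""

entries = feed_to_abridged(SAMPLE)
print(json.dumps(entries, indent=2))
```

Keeping only a few fields per item is what makes the result "abridged": the prompt in the next step receives far less text than the raw feed contains.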
Here's the Prompterfile:
load https://mshibanami.github.io/GitHubTrendingRSS/weekly/all.xml
transform feed-to-abridged-json
complete summarize-trending-repos.task
speak
Here's the summarize-trending-repos.task task:
summarize-trending-repos.task
The prompt goes here
Break an EPUB into chunks and compute embeddings
This Prompterfile shows how to break an EPUB into chunks of ~500 tokens and compute their embeddings. The embeddings are saved in a CSV file and the chunks are saved in a JSON file.
# Set filename variable that excludes an extension
set FN my-ebook
load {{FN}}.epub
select "block_tag like 'ch%.html'"
transform clean-epub html-to-md
transform token-split --n=500 --overlap=0
export --fn=out-{{ FN }}.json
embed --fn=out-{{ FN }}.csv
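The pairing of a JSON file of chunks with a CSV file of embeddings can be sketched as follows. Everything here is an assumption made for illustration: the column layout, the `id` field that links the two files, and especially `toy_embedding`, which is a hash-based stand-in and not a real embedding model.

```python
import csv
import hashlib
import io
import json


def toy_embedding(text, dims=4):
    """Toy stand-in for an embedding model: count words per hashed bucket."""
    vector = [0.0] * dims
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vector[digest[0] % dims] += 1.0
    return vector


def export_chunks(chunks):
    """Serialize chunks as JSON, giving each one an id (hypothetical layout)."""
    records = [{"id": i, "content": c} for i, c in enumerate(chunks)]
    return json.dumps(records, indent=2)


def export_embeddings(chunks, dims=4):
    """Serialize one CSV row per chunk: its id followed by the vector."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["id"] + [f"dim_{d}" for d in range(dims)])
    for i, chunk in enumerate(chunks):
        writer.writerow([i] + toy_embedding(chunk, dims))
    return buffer.getvalue()


chunks = ["First chunk of the book.", "Second chunk of the book."]
chunks_json = export_chunks(chunks)
embeddings_csv = export_embeddings(chunks)
print(embeddings_csv)
```

Sharing an id between the two outputs lets a later search step find the nearest vectors in the CSV and then look up the matching text in the JSON.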
Using Jinja in a Prompterfile
You can use Jinja template constructs to create more complex logic in a Prompterfile. For example, this Prompterfile uses a Jinja loop over a list of durations to generate one series of commands per duration, each producing a summary of the squashed text at that length:
load data/source/*.html
select "block_tag like '%-ch%'"
transform strip-attributes extract-headers
complete task-summarize-block.txt
retag gist
squash --tag=squashed
{% for duration in ['30-seconds', '2-minutes'] %}
checkout squashed
# Set an environment variable to set context that can be used in the prompt
set DURATION {{duration}}
complete task-get-the-gist-duration.txt --context=data/metadata.yaml --model=gpt-4o
retag gist-{{duration}}
speak --speed=1.2
{% endfor %}
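To see what the loop produces, it can help to unroll it by hand. The sketch below imitates the expansion with Python's `str.format` rather than Jinja itself, and it assumes the template is rendered into plain commands before any of them run; the comment line inside the loop is omitted for brevity.

```python
# The body of the for-loop, with {duration} marking the loop variable.
LOOP_BODY = """checkout squashed
set DURATION {duration}
complete task-get-the-gist-duration.txt --context=data/metadata.yaml --model=gpt-4o
retag gist-{duration}
speak --speed=1.2"""


def unroll(durations):
    """Repeat the loop body once per duration, substituting the variable."""
    return "\n".join(LOOP_BODY.format(duration=d) for d in durations)


expanded = unroll(["30-seconds", "2-minutes"])
print(expanded)
```

Each iteration begins with `checkout squashed`, so every duration starts again from the same squashed text instead of from the previous iteration's output.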