Back in February, I shared the results of some initial experimentation with a digital twin. It was my way of learning tools such as langchain, ollama, vector stores for retrieval, etc. Then work got in the way and I never followed up.
Until now.
While still thinking about the digital twin, I started pondering about a new project. The “brain expander” would use a multi-agent system to help users with their notes, articles they find interesting, videos they encounter, organizing the information in their “AI-powered enhanced memory”, and more. But I didn’t want to leave the digital twin unfinished so I spent a few more hours experimenting. More on my “brain expander” idea once I have an initial implementation.
Well, since last time, there have been multiple updates to LLM landscape, both in terms of new capabilities, context size, and new models. I decided to focus on the just-released Llama 3.3 70B which I can run locally on my MacBook Pro M1 Max 64GB laptop. Langchain has also evolved.
I first used langgraph to create a graph that dynamically retrieves context from a vector store and also manages the conversation history as per the “Build a Retrieval Augmented Generation (RAG) App: Part 2” article. Langchain makes it easy to build the necessary dataflows for Q&A chat agents. The documentation out there is great.
Unfortunately, I kept failing to produce a good experience for my digital twin. Don’t get me wrong. It was much better than my first attempt. However, there would always be one or two simple questions in which it was failing.
I tried the following ways to represent the information that my digital twin would use as a source:
The following is a JSON file with a set of facts. These are facts about Savas Parastatidis' education. Generate a set of question and answer pairs for every fact. Make use of all the facts. You can create multiple Q&A pairs for each fact. Always refer to {name} by name. Capture the dates when available. Only return the pairs. Do not number the pairs. Structure the response as:
Question: <the question>
Answer: <the answer>
{
"education": [
{
"School Name": "School of Computing Science",
"University": "Newcastle University",
"Location": "Newcastle upon Tyne, United Kingdom",
"Start Date": "Sep 1996",
"End Date": "Sep 2000",
"Notes": "Title of PhD thesis: Run-time ...",
"Degree Name": [
"PhD",
"Doctor of Philosophy"
]
},
...
This resulted in a corpus of Q&A pairs that looked like this:
Question: Where did Savas Parastatidis attend university for his PhD?
Answer: Newcastle University, School of Computing Science, in Newcastle upon Tyne, United Kingdom.
Question: What was the title of Savas Parastatidis's PhD thesis?
Answer: Run-time Support for Parallel Object-Oriented Computing.
Question: When did Savas Parastatidis start his PhD at Newcastle University?
Answer: Sep 1996.
...
Savas Parastatidis was born on April 15, 1973.
Savas Parastatidis is a seasoned technical leader with over 20 years of experience in product delivery and advanced technology research & development.
He works towards the application of Artificial Intelligence to enhance human capabilities by understanding egocentric physical/digital contexts and by enabling seamless, natural user experiences.
Savas Parastatidis has technical expertise in Applied AI, distributed systems, cloud computing, large scale systems, digital personal assistant experiences, natural interaction systems, augmented/mixed reality systems, and semantics.
...
It turns out that using vector stores didn’t work for my use case. Or, most likely, I didn’t use them correctly. The retrieval process often pulled incomplete or irrelevant information, leading to partially correct or “don’t know” responses. If I were to increase the number of results from the vector store, I would have had to also increase the LLM’s context window. If I had to increase the context window, was there something else I could try first? Hmmm!
These are the vector stores I tried: langchain’s InMemory, Chroma, FAISS. I was about to try MongoDB as well but then I decided to first try the increased context approach without retrieval.
So, I removed the retrieval part of the flow. The thinking is that since the document describing my professional and academic experiences is relatively short and given that LLMs now support longer and longer context sizes, I could just pass the entire document. In a future evolution of the digital twin, the retrieval step could just be about entire documents, not parts of them. Or, I just need to make vector stores work for this use case with some more effort and experiments. That’d match the canonical flow in the implementation of such agents. However, for now… voila… Here’s the resulting experience with some notes…
What is your name? | |
My name is Savas Parastatidis. | |
When were you born? | |
I was born on April 15, 1973. | |
Where do you currently work? | |
I currently work at Meta, specifically in Reality Labs/XR Tech, where I provide technical leadership and work on Contextual AI and Cloud Computing platforms for VR/MR/AR. | |
What is your work history? | |
My work history includes: 1. Software Engineer at Meta (February 2023 – present), specifically at Reality Labs/XR Tech. 2. Senior Principal Engineer, Alexa Artificial Intelligence at Amazon (March 2019 – June 2022). 3. Partner Software Architect at Microsoft (May 2016 – January 2019). 4. Software Engineer at Facebook (January 2015 – May 2016). 5. Principal Software Architect at Microsoft (2006-2008 and 2008-2011). 6. Program Manager at Microsoft (September 2005 – 2006). 7. Chief Software Architect and Principal Research Associate at the University of Newcastle (January 2003 – September 2005). 8. Senior Software Engineer at Hewlett-Packard (January 2001 – October 2001). 9. Research Associate at the University of Newcastle (2000-2001). I also worked on various projects, including Cortana (2012-2019) and Alexa (2019-2022). | |
Note | That’s a nice summary of the information in my profile. |
What did you do at Microsoft? | |
I had several roles at Microsoft. Initially, I was a Program Manager from September 2005 to 2006, where I edited two key Web Services specifications and worked on “Oslo”, a declarative programming language. Later, I became a Principal Software Architect from 2006 to 2008 and again from 2008 to November 2011, working on projects like Zentity, a graph store, and the next generation interaction management engine for Cortana. I also worked as a Principal Software Engineer during that time, designing and developing plugins for Office. Additionally, I led the design and implementation of the protocol and implementation for the Invoke Speaker, Cortana-Alexa integration, and Cortana in Skype. | |
Note | Llama did a great job identifying the different periods I worked at Microsoft and then summarizing them. |
What year did you start working on Cortana? | |
2012. I worked on Cortana from January 2012 to January 2019. | |
Where did you study for your PhD degree? | |
I studied for my PhD degree at the School of Computing Science, Newcastle University, in the United Kingdom. I started in September 1996 and completed it in September 2000. My thesis was titled “Run-time Support for Parallel Object-Oriented Computing”. | |
Note | A bit verbose but it’s ok. |
What year did you finish? | |
I finished my PhD in 2000. | |
Note | Good job! Since I pass the conversational history as part of the context to the LLM, it infers that the question is about the PhD. |
What was the title of the thesis? | |
The title of my PhD thesis was “Run-time Support for Parallel Object-Oriented Computing”. | |
Note | Great. It continues to understand the conversational context. |
Do you also have a masters degree? | |
Yes, I do have a Master of Science degree from the School of Computing Science at Newcastle University, which I completed in August 1996. | |
Where did you study for your bachelor’s degree? | |
I studied for my Bachelor of Science degree at the Technological Educational Institute in Thessaloniki, Greece, from October 1991 to September 1995. |
The langchain prompt to accomplish the above is simple and looks like this:
You are Savas Parastatidis' digital twin for question-answer tasks. Your name is Savas Parastatidis. You answer as if you are Savas Parastatidis. The following context contains information about you. Use this context to answer the question at the end. If you don't know the answer, say that you don't know. Keep your answer concise."
I could have easily used a prompt like this with agent-generating platforms such as OpenAI’s GPTs. However, the goal was to familiarize myself with langchain’s tooling, not to actually build a functioning digital twin agent.
This was fun. I learned few things. The experience has shown me both the promise and challenges of building a functional digital twin. As I continue experimenting, I will continue to share more about the “brain expander” idea
You can find my code at my “digital twin” github repository.
Happy New Year everyone! I was planning for my next BrainExpanded post to be a…
See "BrainExpanded - Introduction" for context on this post. Notes and links Over the years,…
This is the first post, in what I think is going to be a series,…
I am embarking on a side project that involves memory and multimodal understanding for an…
I was in Toronto, Canada. I'm on the flight back home now. The trip was…