A secure cloud Linux computer powered by E2B Desktop Sandbox and controlled by open-source LLMs.
- Uses E2B for secure Desktop Sandbox
- Uses Llama 3.2, 3.3 and OS-Atlas
- Operates the computer via a combination of keyboard, mouse, and shell commands
- Live streams the display of the sandbox on the client computer
- The user can pause the agent and provide feedback at any time
- Designed to work on any operating system or platform
- Supports a number of inference providers, including Hugging Face, Fireworks, OpenRouter, etc.
The details of the design are laid out in this article: How I taught an AI to use a computer
- Python 3.10 or later
- git
- E2B API key
- Fireworks API key
In your terminal (the command below uses Homebrew):
brew install poetry ffmpeg
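Before continuing, it can help to confirm that the prerequisites are actually visible from your shell. The following is a minimal sketch, not part of the project: the helper names `missing_tools` and `python_ok` are hypothetical, and the script only checks that the tools named above are on `PATH` and that the interpreter is Python 3.10 or later.

```python
import shutil
import sys

# Tools named in the prerequisites and install step above.
REQUIRED_TOOLS = ["git", "poetry", "ffmpeg"]


def missing_tools(tools, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def python_ok(version_info=sys.version_info, minimum=(3, 10)):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not python_ok():
        print("Python 3.10 or later is required")
    for tool in missing_tools(REQUIRED_TOOLS):
        print(f"Not found on PATH: {tool}")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is enough.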
In your terminal, clone the repository:
git clone https://github.com/e2b-dev/open-computer-use/
Enter the project directory:
cd open-computer-use
Create a .env file in open-computer-use and set the following:
# Get your API key here - https://e2b.dev/
E2B_API_KEY="your-e2b-api-key"
FIREWORKS_API_KEY="your-fireworks-api-key"
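A common stumbling block is leaving a placeholder value in the .env file. The sketch below is one way to catch that before starting the agent. It is an illustration only: `parse_env` and `problems` are hypothetical helpers, the parser handles just simple `KEY="value"` lines like the ones above, and it makes no assumption about how the project itself loads the file.

```python
from pathlib import Path

# Keys and placeholder values taken from the .env example above.
REQUIRED_KEYS = ("E2B_API_KEY", "FIREWORKS_API_KEY")
PLACEHOLDERS = {"your-e2b-api-key", "your-fireworks-api-key"}


def parse_env(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def problems(values):
    """Return a human-readable issue for each missing or placeholder key."""
    issues = []
    for key in REQUIRED_KEYS:
        value = values.get(key, "")
        if not value:
            issues.append(f"{key} is missing or empty")
        elif value in PLACEHOLDERS:
            issues.append(f"{key} still has the placeholder value")
    return issues


if __name__ == "__main__":
    env_path = Path(".env")
    if not env_path.exists():
        print("No .env file found in the current directory")
    else:
        for issue in problems(parse_env(env_path.read_text())):
            print(issue)
```

Run it from the open-computer-use directory; no output means both keys are present and no longer hold the placeholder text.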
Run the following commands to install the dependencies and start the agent:
poetry install
poetry run start
The agent will start and prompt you for its first instruction.
Open Computer Use supports a variety of LLMs and LLM providers, which are defined in models.py.
The following lines of code can be changed to any valid combination of model and provider, so long as the new vision model supports vision input and the new action model supports tool use.
vision_model = FireworksProvider(model_names["fireworks"]["llama3.2"])
action_model = FireworksProvider(model_names["fireworks"]["llama3.3"])
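The constraint on the two roles can be made explicit with a small check. The sketch below is self-contained and does not use the project's provider classes: `ModelSpec` and `check_pair` are hypothetical names, and the capability flags are illustrative values you would fill in yourself from your provider's documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical description of a model's relevant capabilities."""
    name: str
    supports_vision: bool = False
    supports_tools: bool = False


def check_pair(vision, action):
    """Return a list of reasons why the pair cannot fill the two roles."""
    errors = []
    if not vision.supports_vision:
        errors.append(f"{vision.name} cannot be the vision model: no vision input")
    if not action.supports_tools:
        errors.append(f"{action.name} cannot be the action model: no tool use")
    return errors


if __name__ == "__main__":
    # Illustrative flags only, mirroring the roles used in the lines above.
    vision = ModelSpec("llama3.2", supports_vision=True)
    action = ModelSpec("llama3.3", supports_tools=True)
    print(check_pair(vision, action) or "pair is valid")
    # Swapping the roles trips both checks with these illustrative flags.
    print(check_pair(action, vision))
```

An empty list means the pair satisfies both requirements. A check like this is most useful when you add a new provider and want to fail early rather than discover a missing capability mid-run.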
If you add models or define a new provider, feel free to make a PR to this repository.