Streamlining CI/CD with Pinecone Local
What is Pinecone Local?
Pinecone Local is an in-memory Pinecone vector database emulator, available as a Docker image, that developers can use for local development and testing.
It integrates smoothly into CI/CD environments, allowing efficient and cost-effective testing without a live billing account.
In this article, we’ll explore how you can use Pinecone Local in your GitHub Actions workflows to do API contract testing, reduce costs, and speed up your CI/CD testing jobs.
Benefits of using Pinecone Local in your cloud CI/CD workflow
Pinecone Local provides:
- Faster test execution
- Reduced cloud costs
- Improved isolation between test runs
- Consistency between development and CI environments
Combining Pinecone Local and GitHub Actions
You can use GitHub Actions and Pinecone Local to build the following workflow, which you can configure to run whenever changes are pushed to a feature branch or merged to main:
- Pull the Pinecone Local Docker image
- Start a Pinecone Local instance for each test run
- Execute tests against the local instance
- Tear down the instance after tests complete
Here's a starter GitHub Actions workflow that you can extend for your own needs:
name: Pinecone CI/CD with Local
on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.x'
      - name: Set up Docker
        uses: docker-practice/actions-setup-docker@master
      - name: Start Pinecone Local
        run: |
          docker pull ghcr.io/pinecone-io/pinecone-index:latest
          docker run -d \
            --name pinecone-local \
            -e PORT=5081 \
            -e INDEX_TYPE=serverless \
            -e DIMENSION=768 \
            -e METRIC=cosine \
            -p 5081:5081 \
            --platform linux/amd64 \
            ghcr.io/pinecone-io/pinecone-index:latest
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install "pinecone[grpc]" pytest
      - name: Run tests
        env:
          PINECONE_API_KEY: dummy-key
          PINECONE_ENVIRONMENT: local
          PINECONE_INDEX: my-index
        run: |
          pytest tests/
      - name: Stop Pinecone Local
        # Run the teardown even when the test step fails
        if: always()
        run: docker stop pinecone-local
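The workflow above starts the container in detached mode, so the test step could in principle begin before the emulator is listening. The following is a minimal sketch of a readiness check, not part of the original workflow. The helper name wait_for_port is hypothetical, and it assumes that an accepted TCP connection on the mapped port is a good enough readiness signal, which Pinecone Local does not promise.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only shows that something is listening
            # on the port; it does not prove the service is fully ready.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A test setup step could call wait_for_port("localhost", 5081), where 5081 matches the PORT value in the workflow, and fail the job early if it returns False.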
A practical example: Upsert with metadata and fetch vectors
Let's look at a practical example of writing some Python code to run against our Pinecone Local instance.
To get started, we'll pull the latest Pinecone Local Docker image:
# Pull the latest Pinecone Local image
docker pull ghcr.io/pinecone-io/pinecone-index:latest
Next, we run an instance of Pinecone Local, using environment variables to configure its functionality and the port it will listen on:
# Start Pinecone Local with one index - take note of the port mappings
docker run -d \
  --name index1 \
  -e PORT=5081 \
  -e INDEX_TYPE=serverless \
  -e DIMENSION=2 \
  -e METRIC=cosine \
  -p 5081:5081 \
  --platform linux/amd64 \
  ghcr.io/pinecone-io/pinecone-index:latest
Next, install the latest pinecone-client:
pip install "pinecone-client[grpc]"
Now, we can write a test.py file with the following contents:
import time

from pinecone.grpc import PineconeGRPC, GRPCClientConfig

# Initialize a client. An API key must be passed, but the
# value does not matter.
pc = PineconeGRPC(api_key="pclocal")

# Target the index. Use the host and port number and disable TLS (SSL)
# connections since we're going over localhost
index1 = pc.Index(host="localhost:5081", grpc_config=GRPCClientConfig(secure=False))

# Upsert records into index1
index1.upsert(
    vectors=[
        {
            "id": "vec1",
            "values": [1.0, 1.5],
            "metadata": {"genre": "comedy"}
        },
        {
            "id": "vec2",
            "values": [2.0, 1.0],
            "metadata": {"genre": "drama"}
        },
        {
            "id": "vec3",
            "values": [0.1, 3.0],
            "metadata": {"genre": "comedy"}
        }
    ],
    namespace="example-namespace"
)

# Wait for the index to be updated
time.sleep(5)

# Check the number of records in the index
print(index1.describe_index_stats())

# Query index1 with a metadata filter
query = index1.query(
    vector=[1.0, 1.5],
    filter={"genre": {"$eq": "comedy"}},
    top_k=1,
    include_values=True,
    include_metadata=True,
    namespace='example-namespace'
)
print(query)
Run the test file with:
python test.py
If all goes well, you should see output similar to the following:
# Output of describe_index_stats call
{'dimension': 2,
 'index_fullness': 0.0,
 'namespaces': {'example-namespace': {'vector_count': 3}},
 'total_vector_count': 3}
# Output of query
{'matches': [{'id': 'vec1',
              'metadata': {'genre': 'comedy'},
              'score': 1.0,
              'sparse_values': {'indices': [], 'values': []},
              'values': [1.0, 1.5]}],
 'namespace': 'example-namespace'}
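The script above waits a fixed five seconds with time.sleep(5) before reading the index. In a CI job, a polling loop with a timeout may finish sooner and fail more clearly when the data never appears. Here is one possible sketch; wait_until is a hypothetical helper, not part of the Pinecone client.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float = 10.0, interval: float = 0.25) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Hypothetical usage with the script above (needs a running instance).
# Assumption: the stats response supports key access, as the printed
# output suggests.
# wait_until(lambda: index1.describe_index_stats()["total_vector_count"] == 3)
```

The condition is passed in as a callable, so the same helper can wait on a record count, a query result, or any other check.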
When to use Pinecone Local in your CI/CD
Pinecone Local is particularly well-suited for the following CI/CD scenarios:
- Rapid Iteration: Give developers quick feedback on changes that affect vector search functionality.
- Pull Request Validation: Ensure code changes don't break existing vector search capabilities before merging.
- Integration Testing: Test how your application interacts with Pinecone's API without affecting production data.
- Offline Development: Let developers work on vector search features without an internet connection or cloud account.
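For the pull request validation and integration testing scenarios above, a test can assert on behavior instead of printing results. The sketch below reuses the upsert and query calls from the earlier example. The function name check_filtered_query_contract is hypothetical, and the sketch assumes an index started with DIMENSION=2 and a response that supports key access like the printed output shown earlier. Against a real instance, a wait between upsert and query would likely be needed, as in the earlier script.

```python
def check_filtered_query_contract(index, namespace: str = "ci-namespace") -> None:
    """Upsert two records, then assert a filtered query returns only the match."""
    index.upsert(
        vectors=[
            {"id": "vec1", "values": [1.0, 1.5], "metadata": {"genre": "comedy"}},
            {"id": "vec2", "values": [2.0, 1.0], "metadata": {"genre": "drama"}},
        ],
        namespace=namespace,
    )
    result = index.query(
        vector=[2.0, 1.0],
        filter={"genre": {"$eq": "comedy"}},
        top_k=2,
        include_metadata=True,
        namespace=namespace,
    )
    matches = result["matches"]
    # The filter should exclude vec2 even though it is the closest vector.
    assert [m["id"] for m in matches] == ["vec1"]
    assert matches[0]["metadata"]["genre"] == "comedy"


# Hypothetical pytest wrapper (needs the pinecone package and a running instance):
# def test_filtered_query():
#     pc = PineconeGRPC(api_key="pclocal")
#     index = pc.Index(host="localhost:5081", grpc_config=GRPCClientConfig(secure=False))
#     check_filtered_query_contract(index)
```

Taking the index as a parameter keeps the check independent of how the client is built, so the same function could run against Pinecone Local or a cloud index.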
Best Practices for Using Pinecone Local in CI/CD
- Environment Parity: To catch issues early, use the same Pinecone Local setup in both the local development and CI environments.
- Resource Management: Ensure proper cleanup of Pinecone Local instances after each CI run to prevent resource conflicts.
- Configuration via Environment Variables: Use environment variables to configure your tests, making it easy to switch between local and cloud environments when needed.
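As an illustration of the last practice, connection settings can be read from the environment in one place. This is a minimal sketch under stated assumptions: the variable name PINECONE_INDEX_HOST and the IndexSettings type are hypothetical, and treating any localhost address as a non-TLS Pinecone Local instance is a simplification.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class IndexSettings:
    api_key: str
    host: str
    secure: bool


def load_index_settings(env: Optional[Mapping[str, str]] = None) -> IndexSettings:
    """Build connection settings from environment variables."""
    env = os.environ if env is None else env
    # PINECONE_INDEX_HOST is a hypothetical variable name for this sketch.
    host = env.get("PINECONE_INDEX_HOST", "localhost:5081")
    # Assumption: anything on localhost is Pinecone Local and runs without TLS.
    is_local = host.startswith(("localhost", "127.0.0.1"))
    return IndexSettings(
        api_key=env.get("PINECONE_API_KEY", "pclocal"),
        host=host,
        secure=not is_local,
    )


# Usage with the client calls shown earlier in this article:
# settings = load_index_settings()
# pc = PineconeGRPC(api_key=settings.api_key)
# index = pc.Index(host=settings.host, grpc_config=GRPCClientConfig(secure=settings.secure))
```

With no variables set, the defaults point at the local instance from the earlier example; setting the host and API key in CI secrets would switch the same tests to a cloud index.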
Conclusion
Pinecone Local offers a practical way to integrate vector database testing into CI/CD pipelines. As a containerized, in-memory emulator of Pinecone's vector database, it enables faster, more reliable, and more cost-effective testing.
Pinecone Local can streamline your development workflow and make it easier to build comprehensive test coverage for your projects.