What should "encrypted" mean for developer tools?
"Encrypted" should mean that your data is transformed into ciphertext using a key that only you possess, before it leaves your machine, and that no one else, not the vendor, not their employees, not a government with a subpoena, can read it. In practice, most products that say "encrypted" mean something weaker: the data is scrambled on the vendor's servers using keys the vendor controls. This protects against stolen hard drives at the data center but not against the vendor itself. True encryption for developer tools requires client-side encryption with user-held keys, where the vendor receives only ciphertext they cannot decrypt. The cipher (AES-256-GCM, ChaCha20-Poly1305) matters less than the key management architecture. Who holds the key determines who can read the data.
The short version
"Encrypted" should mean your files get scrambled on YOUR computer, with a key only YOU have, before anything gets sent anywhere. Nobody else can unscramble them. Not the company. Not their employees. Not a court order.
Most tools that say "encrypted" mean something weaker. They scramble your stuff on their servers with their own keys. That protects against someone stealing a hard drive from their data center, but it does not protect against the company itself reading your work.
What matters
The specific scrambling method matters less than WHO holds the key. If the company holds it, they can read your files. If you hold it, they cannot.
"Encrypted" in the developer tooling space has become a compliance checkbox, not an architectural property. The meaningful distinction is between server-side encryption (vendor holds keys, vendor can read data) and client-side encryption (user holds keys, vendor receives only ciphertext).
This matters for AI infrastructure. If your MCP server configs, RAG pipeline credentials, or agent system prompts are stored with a vendor who holds the decryption key, those are readable by the vendor. The cipher strength (AES-256-GCM vs ChaCha20-Poly1305) is irrelevant to this threat. Key custody is the only variable that determines whether "encrypted" means protected or merely compliant.
Parallel: an LLM provider that stores your fine-tuning data "encrypted" with their keys can read every training example. The encryption protects against disk theft at the data center, not against the provider itself.
The marketing word
Every cloud storage product says "encrypted." Every backup service says "secure." Every SaaS dashboard has a lock icon somewhere on the page. The word "encrypted" appears on marketing pages the way "organic" appears on food labels. It tells you something happened, but not what.
The problem is not that these companies are lying. Most of them are telling the truth. Your data probably is encrypted at some point in its journey. The problem is that "encrypted" without context is meaningless. It does not tell you when the encryption happens, where the key lives, or who can read the plaintext.
A padlocked shipping container is encrypted. So is a diary with the key taped to the cover. Both are technically "encrypted." Only one of them is protecting anything.
This post breaks down the spectrum from bare minimum to actual protection. No code. No product pitches. A mental model you can apply to every "encrypted" claim you see on a product page.
Every tool you use says "encrypted" somewhere on its website. Every backup service says "secure." There is a lock icon on every dashboard. But what does that actually mean for your projects?
Here is the problem
These companies are not lying. Your data probably IS encrypted at some point. But "encrypted" without context is like "organic" on a food label. It tells you something happened, not what.
Think of it this way: a padlocked shipping container is "encrypted." So is a diary with the key taped to the cover. Both are technically encrypted. Only one is actually protecting anything.
This post gives you a simple mental model so you can tell the difference every time you see "encrypted" on a product page.
"Encrypted" has become a compliance signal, not a technical descriptor. Every cloud storage product, every SaaS dashboard, every backup service uses the word. It appears on marketing pages with the same specificity as "AI-powered": it indicates a category of technology was applied, but not how, when, or with what guarantees.
The problem is not dishonesty. Most vendors are telling the truth. The problem is that "encrypted" without specifying when encryption occurs, where the key resides, and who can access plaintext is operationally meaningless. It does not survive a threat model analysis.
For AI engineers shipping agent configs, vector store credentials, and pipeline secrets: the word "encrypted" on a vendor page is the beginning of the evaluation, not the conclusion. This post provides the framework for that evaluation.
The four levels of encryption
Not all encryption is equal. There are four distinct levels, and the difference between them is enormous. Most products sit at Level 2. Most developers assume they are getting Level 4.
Level 1: Encrypted in transit only (TLS)
This is the bare minimum. Your data is encrypted while it travels between your machine and the server. That is what the lock icon in your browser means. The protocol is TLS (Transport Layer Security), and it prevents someone sitting on the same Wi-Fi network from reading your traffic.
But once your data reaches the server, the encryption stops. The server receives your data in plaintext. It can read it, index it, process it, and store it however it wants. TLS protects the pipe, not the data.
Every legitimate website has TLS. If a service proudly claims "encrypted" and all they mean is TLS, they are claiming credit for the absolute minimum. That is like a restaurant advertising "we have running water."
Level 2: Encrypted at rest with vendor-managed keys
This is where most cloud services sit. Your data is encrypted on disk using keys that the vendor generates, stores, and manages. AWS S3, Google Cloud Storage, Azure Blob Storage, Dropbox, Google Drive, and most SaaS products use this model.
The encryption is real. If someone physically stole a hard drive from the data center, they could not read your files. That is what "at rest" encryption protects against: physical theft and certain types of insider access.
But the vendor holds the key. They can decrypt your data whenever they want. They can read it for feature development. They can scan it for policy violations. They can hand it to law enforcement with a valid subpoena. They can accidentally expose it through a misconfigured API.
Level 2 encryption protects your data from everyone except the vendor. If your threat model includes the vendor, Level 2 does nothing for you.
Level 3: Encrypted at rest with customer-managed keys (BYOK)
Some cloud services let you bring your own key (BYOK) or use a customer-managed key (CMK). You generate the encryption key. The vendor uses your key to encrypt and decrypt your data. If you revoke the key, the vendor can no longer access the data.
This is better than Level 2 because the customer has a kill switch. But there is a catch: the vendor still sees your key during encryption and decryption operations. The key passes through their system. They process your plaintext data in memory on their servers. They promise not to store the key, but the technical architecture allows them to.
BYOK is a trust reduction, not a trust elimination. You are trusting the vendor with your key less often and with more explicit permission. But you are still trusting them.
Level 4: Client-side encryption
This is the level most people imagine when they hear "encrypted." Your data is encrypted on your machine before it ever leaves. The vendor receives ciphertext. They store ciphertext. They never see the plaintext. They never see the key. They could not read your data even if they tried.
If the vendor is compromised, your data is still encrypted. If a law enforcement request arrives, the vendor can only hand over ciphertext. If an insider goes rogue, they get binary noise. The vendor is structurally unable to access your data because the architecture prevents it, not because a policy says they will not.
Level 4 is what "zero-trust" encryption actually means. Not "we trust them and they promise to be careful." Not "the policy says they will not look." But: the math prevents it.
The four levels, compared
| Level | Name | Who holds the key | Who can read the data |
|---|---|---|---|
| 1 | In transit (TLS) | Client and server (ephemeral session key) | The vendor, after receipt |
| 2 | At rest, vendor key | The vendor | The vendor, any time |
| 3 | At rest, customer key (BYOK) | Customer (but vendor sees it) | The vendor, during operations |
| 4 | Client-side encryption | Customer only | Customer only |
Most "encrypted" products are Level 2. A few offer Level 3. Very few are Level 4. When a product page says "encrypted," your first job is to figure out which level they mean.
There are four levels of encryption. Most tools sit at Level 2. Most people think they are getting Level 4. Here is the difference.
Level 1: Scrambled while traveling
Your files are protected while they move between your computer and the server. That is the lock icon in your browser. But once they arrive, the server can read everything. This is the bare minimum. Every website does this.
Level 2: Scrambled on their servers, with their key
This is where most services sit (Dropbox, Google Drive, most SaaS). Your files are scrambled on disk, but the company holds the key. They can read your stuff whenever they want. If someone steals a hard drive from the data center, your files are safe. But the company itself? They have full access.
Level 3: You bring the key, but they still see it
Some services let you provide your own key. Better, because you have a kill switch. But the key still passes through their system during use. They promise not to keep it, but the setup allows them to. It reduces trust. It does not eliminate it.
Level 4: Scrambled on YOUR machine, key never leaves
This is what most people imagine "encrypted" means. Your files are scrambled on your computer before they go anywhere. The service only ever sees scrambled data. They cannot read it even if they try. If they get hacked, your stuff is still safe. The math prevents access, not a policy.
Most "encrypted" tools are Level 2. A few offer Level 3. Very few are Level 4. When you see "encrypted" on a product page, your first job is to figure out which level they mean.
Four distinct architectural tiers exist. The gap between them is enormous, and most vendor claims conflate Level 2 with Level 4.
Level 1: TLS between client and server. Prevents network eavesdropping. Server receives plaintext. This is table stakes: every HTTPS endpoint has it. Claiming "encrypted" based solely on TLS is like claiming "tested" because the compiler ran.
Level 2: Data encrypted on disk with vendor-managed keys. AWS S3, GCS, Azure Blob, Dropbox, and most SaaS products operate here. Protects against physical disk theft. Does not protect against the vendor, their employees, compromised infrastructure, or compelled disclosure. The vendor can decrypt at will.
Level 3: Customer-managed keys with vendor-side encryption/decryption. The key transits vendor infrastructure during operations. Trust is reduced (customer has revocation), not eliminated. The vendor processes plaintext in memory. Architecturally, the vendor CAN retain the key even if policy says otherwise.
Level 4: Encryption before data leaves the client. Vendor receives and stores only ciphertext. Key never touches vendor infrastructure. Compromise of the vendor yields binary noise. This is the only level where "encrypted" survives an adversarial threat model that includes the vendor.
Infrastructure parallel: most managed vector stores (Pinecone, Weaviate Cloud) operate at Level 2. Your embeddings, which encode semantic meaning from proprietary documents, sit on vendor infrastructure encrypted with vendor keys. Because embeddings can be compared, clustered, and partially inverted back toward their source text, a vendor that can decrypt them can recover much of that meaning.
"Private" does not mean encrypted
This is the confusion that catches the most developers. A "private repository" on GitHub or GitLab is not encrypted. It is access-controlled. There is a difference.
Access control means: only people with the right permissions can see it. The platform decides who has the right permissions. The platform itself can always see it. GitHub can read every private repository on their platform. They have to. Their features depend on it: code search, Copilot, security scanning, dependency graphs. These features require reading the plaintext code.
"Private" means the public cannot see it. It does not mean the platform cannot see it. It does not mean employees cannot see it. It does not mean a breach would not expose it.
Access control is not encryption. A locked door keeps out guests. Encryption makes the contents unreadable even to someone standing inside the room. These are different things.
This distinction matters for compliance, for intellectual property protection, and for any threat model that includes the platform itself. If your security depends on a private repo being unreadable, you are depending on something that is not true.
This is the one that trips up the most people. When you set a project to "private" on GitHub or GitLab, it is not encrypted. It is access-controlled. Those are different things.
Think of it this way
"Private" means the public cannot see your project. But GitHub can still read every file. They have to: code search, Copilot, security scanning, and dependency graphs all require reading your actual code.
The difference
A locked door keeps out guests. Encryption makes the contents unreadable even to someone standing inside the room. "Private" is a locked door. Your files are still readable by the platform, employees, and anyone who breaches the platform.
If your project contains anything sensitive (client work, trade secrets, credentials), "private" is not enough. You need actual encryption where the platform cannot read your files even if it wanted to.
This is the most common conflation in developer tooling security. "Private" on GitHub/GitLab is an authorization boundary, not a cryptographic one. The platform retains full read access to plaintext.
GitHub's own feature set proves the point. Code search, Copilot, security scanning, dependency graph analysis: all of these require reading plaintext source code. The platform must have read access to deliver these features. "Private" restricts public visibility. It does not restrict platform visibility.
For AI engineers: your agent system prompts, MCP server configurations, and RAG pipeline schemas stored in "private" repositories are readable by the platform. If those contain proprietary reasoning chains or competitive differentiators, "private" does not protect them from platform-level access, breaches, or compelled disclosure.
What AES-256-GCM actually provides
When security-focused products talk about their encryption, you will often see "AES-256-GCM." Here is what each part of that means in plain language.
AES stands for Advanced Encryption Standard. It is the most widely used symmetric encryption algorithm in the world. "Symmetric" means the same key encrypts and decrypts. The US government uses AES. Banks use AES. Your phone uses AES. It has been studied by thousands of cryptographers over more than 25 years. Nobody has found a practical way to break it.
256 is the key length in bits. A 256-bit key means there are 2^256 possible keys, roughly 1.16 × 10^77. That is within a few orders of magnitude of the estimated number of atoms in the observable universe (commonly put at around 10^80). Brute-forcing a 256-bit key is not a realistic attack, even with hardware that does not exist yet.
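To make the size of that keyspace concrete, here is a rough back-of-the-envelope calculation. The attacker speed of 10^18 guesses per second is a hypothetical figure chosen for illustration, not a measurement of any real hardware.

```python
# Back-of-the-envelope brute-force estimate for a 256-bit key.
KEY_BITS = 256
keyspace = 2 ** KEY_BITS  # total number of possible keys

# Assumption (hypothetical attacker): 10^18 key guesses per second.
GUESSES_PER_SECOND = 10 ** 18
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

# On average, the key is found after searching half the keyspace.
expected_guesses = keyspace // 2
expected_years = expected_guesses // (GUESSES_PER_SECOND * SECONDS_PER_YEAR)

print(f"keyspace has {len(str(keyspace))} decimal digits")
print(f"expected search time: on the order of 10^{len(str(expected_years)) - 1} years")
```

Even under that generous assumption, the expected search time exceeds the age of the universe (about 1.4 × 10^10 years) by dozens of orders of magnitude.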
GCM stands for Galois/Counter Mode. This is the part that elevates AES from "just encryption" to "authenticated encryption." GCM provides three things at once:
- Confidentiality: Nobody can read the data without the key.
- Integrity: Nobody can change the encrypted data without detection. If even one bit is flipped, decryption will fail.
- Authentication: The decrypted data is proven to be exactly what was encrypted. You know it came from someone who held the key, and you know it was not altered in transit or storage.
The combination of all three is called AEAD: Authenticated Encryption with Associated Data. AEAD is the current standard for how encryption should work. If a product uses AES without an authentication mode (like plain AES-CBC without HMAC), that is a warning sign. Encryption without authentication means someone could tamper with your ciphertext and you would not know until the decrypted output made no sense, or worse, silently produced wrong data.
// Without authentication (AES-CBC alone):
Encrypt(key, plaintext) => ciphertext
// An attacker can flip bits in ciphertext.
// Decryption "succeeds" but produces garbage or worse.
// With authentication (AES-256-GCM):
Encrypt(key, nonce, plaintext) => ciphertext + auth_tag
// An attacker flips a bit in ciphertext.
// Decryption REJECTS the data immediately.
// You know it was tampered with. Nothing silently corrupts.
One detail matters for correctness: the nonce (a "number used once": a value, often random, that must never repeat for a given key). AES-256-GCM requires a unique nonce for every encryption under the same key. If the same nonce is ever reused with the same key, the security guarantees collapse. This is not an obscure edge case. It is a hard requirement. Any correctly built system uses a fresh nonce for each encryption, typically random or counter-based, and never reuses one under the same key.
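The reject-on-tamper behaviour in the pseudocode above can be sketched in a few lines. Python's standard library has no AES-GCM, so this sketch is not AES-GCM: it attaches an HMAC-SHA256 tag to already-encrypted bytes, which illustrates only the authentication half of authenticated encryption. The function names `seal` and `open_sealed` are hypothetical.

```python
import hashlib
import hmac
import secrets

TAG_LEN = 32  # HMAC-SHA256 output size in bytes


def seal(mac_key: bytes, ciphertext: bytes) -> bytes:
    """Append an authentication tag computed over the ciphertext."""
    tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    return ciphertext + tag


def open_sealed(mac_key: bytes, sealed: bytes) -> bytes:
    """Return the ciphertext if the tag verifies, otherwise raise ValueError."""
    ciphertext, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    # compare_digest runs in constant time, so it does not leak how many bytes matched.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed: data was modified")
    return ciphertext


mac_key = secrets.token_bytes(32)
sealed = seal(mac_key, b"opaque ciphertext bytes")

# Flip a single bit in the first byte, as an attacker might.
tampered = bytes([sealed[0] ^ 0x01]) + sealed[1:]

assert open_sealed(mac_key, sealed) == b"opaque ciphertext bytes"
try:
    open_sealed(mac_key, tampered)
except ValueError as exc:
    print("rejected:", exc)
```

In a real system an AEAD mode such as AES-256-GCM computes and checks the tag as part of encryption and decryption, so there is no separate step to forget.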
You will see "AES-256-GCM" on security-focused product pages. Here is what it means in plain language.
Breaking it down
AES = the scrambling method. Used by governments, banks, and your phone. Studied for 25+ years. Nobody has cracked it.
256 = the key size. The number of possible keys is 78 digits long, in the same league as the number of atoms in the universe. Guessing the key is not realistic, ever.
GCM = the mode that adds tamper detection. If anyone changes even one tiny piece of your scrambled file, the system catches it and rejects the file immediately.
Together, these three pieces give you three protections: nobody can read it, nobody can change it, and you can prove the file is exactly what was originally scrambled.
What to look for
If a product says "AES-256-GCM" or "AEAD," that is a good sign: they are using modern, authenticated encryption. If they say "AES" without mentioning GCM or authentication, that is a yellow flag. Encryption without tamper detection can silently produce wrong data.
AES-256-GCM is the cipher most security-focused tools cite. Here is what each component provides and where the actual risk lives.
AES-256: Symmetric block cipher, 256-bit key, hardware-accelerated via AES-NI on modern CPUs. No practical cryptanalytic break exists.
GCM (Galois/Counter Mode): Provides AEAD (authenticated encryption with associated data). Three properties: confidentiality (unreadable without key), integrity (tamper detection), and authentication (proof of origin). This is the current standard. AES-CBC without HMAC is a known weakness: it allows ciphertext manipulation without detection.
The critical implementation detail is nonce management. AES-256-GCM requires a unique 96-bit nonce per encryption operation under the same key. Nonce reuse with the same key catastrophically breaks both confidentiality and authentication. This is not a theoretical concern. It is a hard constraint that distinguishes correct implementations from vulnerable ones.
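Why nonce reuse is so damaging can be shown with a toy counter-mode cipher. The sketch below is not AES and must not be used for real encryption: the keystream comes from SHA-256 purely so the example runs on the standard library, and all function names are hypothetical. GCM's confidentiality also rests on a counter-mode keystream, so the same cancellation happens there, and in GCM a repeated nonce additionally exposes the authentication subkey.

```python
import hashlib
import secrets


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy counter-mode keystream: SHA-256(key || nonce || counter) blocks."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """Encrypt by XORing the plaintext with the keystream (toy cipher only)."""
    return xor(plaintext, keystream(key, nonce, len(plaintext)))


key = secrets.token_bytes(32)
nonce = secrets.token_bytes(12)  # 96 bits, reused below on purpose

p1 = b"deploy key: alpha"
p2 = b"deploy key: bravo"
c1 = toy_encrypt(key, nonce, p1)
c2 = toy_encrypt(key, nonce, p2)  # BUG: same key and same nonce

# The keystream cancels out: an eavesdropper learns p1 XOR p2 without the key.
assert xor(c1, c2) == xor(p1, p2)
# Knowing (or guessing) one plaintext then reveals the other.
assert xor(xor(c1, c2), p1) == p2
```

With a fresh nonce per message the two keystreams differ and the cancellation no longer works, which is the whole point of the uniqueness requirement.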
For AI pipelines: if you are encrypting model weights, embedding caches, or agent state at rest, AES-256-GCM is the correct primitive. But the cipher selection is the solved problem. The unsolved problem is where the key lives. That is the next section.
Key management matters more than the cipher
Most conversations about encryption focus on the algorithm: AES-256, ChaCha20, RSA. But the algorithm is the solved problem. AES-256-GCM is well understood, heavily studied, and implemented in hardware on most modern CPUs. It is not the weak link.
The weak link is always key management. Where is the key stored? How is it derived? Who has access to it? What happens when it needs to rotate? What happens when the user loses it?
Consider two products that both use AES-256-GCM:
- Product A stores your encryption key in a database on the same server as your encrypted data. The key is protected by access controls, but if the server is compromised, the attacker gets both the ciphertext and the key. Game over.
- Product B derives your encryption key on your local machine from a master secret that never leaves the device. The server only ever sees ciphertext. If the server is compromised, the attacker gets binary noise with no way to decrypt it.
Both products use the same cipher. Both can truthfully say "AES-256-GCM encrypted." But they have completely different security properties. The cipher is the same. The key management is what separates them.
"We use AES-256" tells you the lock is strong. It tells you nothing about where the key is hanging.
Good key management has a few characteristics worth looking for:
- Key derivation: Individual encryption keys are derived from a master secret, which is never used directly to encrypt data. If one derived key is compromised, the other keys remain safe. A standard key derivation function handles this automatically.
- Key isolation: The key never exists on the same system as the encrypted data unless that system is under the user's control. Storing the key next to the ciphertext is like locking the door and leaving the key under the mat.
- Key rotation: Keys can be rotated without re-downloading and re-uploading all encrypted data. Good architecture plans for this from the start.
- Key loss: If the user loses the key, the data is gone. This sounds scary, but it is actually the proof that the system works. If the vendor has a "recover your key" option, they have a copy of your key. That puts you back at Level 2.
The acid test for key management: If the vendor's entire infrastructure is compromised, can they read your data? If the answer is yes, the key management is the weak link, not the cipher.
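The key derivation property can be sketched with HKDF (RFC 5869), built here from the standard library's `hmac` module. This is a minimal illustration under stated assumptions: the master secret is high-entropy random bytes held on the user's device, and the `project:` labels are hypothetical. A password-based master secret would need a slow KDF such as scrypt first.

```python
import hashlib
import hmac
import secrets

HASH_LEN = 32  # SHA-256 output size in bytes


def hkdf_sha256(master: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a pseudorandom key, then expand it."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested length too large for HKDF")
    # Extract: an empty salt defaults to HASH_LEN zero bytes, per the RFC.
    prk = hmac.new(salt or b"\x00" * HASH_LEN, master, hashlib.sha256).digest()
    # Expand: T(i) = HMAC(PRK, T(i-1) || info || i)
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


master_secret = secrets.token_bytes(32)  # generated and kept on the user's device
salt = secrets.token_bytes(16)           # not secret; can be stored with the ciphertext

# One independent 256-bit key per project, all derived from the same master secret.
key_alpha = hkdf_sha256(master_secret, salt, b"project:alpha")
key_bravo = hkdf_sha256(master_secret, salt, b"project:bravo")
assert key_alpha != key_bravo and len(key_alpha) == 32
```

Because each key depends on its `info` label, leaking `key_alpha` tells an attacker nothing useful about `key_bravo` or about the master secret itself.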
Most security conversations focus on the scrambling method. But the scrambling method is the solved problem. What actually matters is: where is the key?
Two tools, same encryption, completely different safety
Tool A stores your key on the same server as your scrambled files. If the server gets hacked, the attacker gets both the lock and the key. Everything is exposed.
Tool B creates your key on your computer, and it never leaves. The server only ever sees scrambled data. If the server gets hacked, the attacker gets noise.
Both tools can honestly say "AES-256-GCM encrypted." But one protects you and the other does not. The scrambling method is the same. The key location is what matters.
Good signs to look for
The key stays on your machine. Each project gets its own key derived from a master key. You can change keys without re-uploading everything. And here is the big one: if you lose the key, the data is gone forever. That sounds scary, but it is proof the system works. If the vendor can "recover your key," it means they have a copy.
The cipher is the solved problem. AES-256-GCM is well-understood, hardware-accelerated, and not the weak link in any real-world breach. Key management is where systems fail.
Product A: key stored in database co-located with ciphertext. Server compromise yields both. Product B: key derived client-side from master secret that never leaves the device. Server compromise yields binary noise. Same cipher. Same marketing claim. Completely different threat surface.
Four properties distinguish sound key management:
Key derivation: individual keys derived from master secret via KDF, not used directly. Compromise of one derived key does not propagate.
Key isolation: key material never co-located with ciphertext on vendor infrastructure.
Key rotation: re-keying without re-uploading all encrypted data. Architecture must plan for this from day one.
Irrecoverable loss: if the vendor can recover your key, the vendor has your key. Key recovery is proof of Level 2, not a feature.
Acid test: if the vendor's entire infrastructure is compromised, can they read your data? If yes, the key management architecture is the vulnerability, regardless of what cipher is in the marketing copy.
The metadata problem
Even with Level 4 client-side encryption, there is a category of information that is hard to protect: metadata.
Metadata is the data about the data. File names. File sizes. Timestamps. Access patterns. How often you upload. When you upload. How large each upload is. The structure of your directory tree. The number of files. Who you share with.
Encrypting the payload (the file contents) does not automatically encrypt the metadata. And metadata can reveal a surprising amount of information even when the contents are completely opaque.
Think about it this way. An encrypted letter in a sealed envelope tells an observer nothing about the message. But the envelope itself tells them: who sent it, who received it, when it was sent, how heavy it is, and where it was mailed from. If they watch the envelopes long enough, they can map your entire social network without ever opening one.
In the context of encrypted cloud storage:
- File names can reveal project names, client names, or internal codenames. If your encrypted files are stored as acme-merger-contract.enc, the encryption did not help much.
- File sizes can reveal file types. A 4 KB file is probably a config. A 500 MB file is probably a database dump or a video.
- Timestamps can reveal work patterns, deployment schedules, and incident response timelines.
- Access patterns can reveal which files are most important based on how often they are accessed or updated.
Good encryption systems address metadata in various ways: padding files to hide true sizes, using opaque identifiers instead of human-readable file names, batching operations to obscure access patterns. None of these are perfect. Metadata protection is a spectrum, not a switch. But a system that encrypts the payload and leaves file names in plaintext has a gap worth knowing about.
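Two of those mitigations, opaque identifiers and size padding, can be sketched as follows. The 64 KiB bucket size and the function names are hypothetical choices for illustration, and the padding helper only computes the target size rather than padding real data.

```python
import hashlib
import hmac
import secrets

BUCKET = 64 * 1024  # hypothetical padding granularity: 64 KiB


def opaque_name(name_key: bytes, filename: str) -> str:
    """Map a readable file name to a keyed identifier that cannot be reversed without the key."""
    return hmac.new(name_key, filename.encode("utf-8"), hashlib.sha256).hexdigest()


def padded_size(size: int, bucket: int = BUCKET) -> int:
    """Round a size up to the next multiple of `bucket` (minimum one bucket)."""
    return max(1, -(-size // bucket)) * bucket


name_key = secrets.token_bytes(32)  # held by the client only
print(opaque_name(name_key, "acme-merger-contract.txt"))

# A 4 KiB config and a 60 KiB document become indistinguishable by size.
assert padded_size(4 * 1024) == padded_size(60 * 1024) == BUCKET
```

Bucketing only hides differences within a bucket: a 500 MB upload still looks nothing like a 4 KB one, which is why metadata protection stays a spectrum.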
When evaluating an encrypted product, ask about metadata. "Is the data encrypted?" and "Is the metadata encrypted?" are two different questions with potentially different answers.
Even with the best encryption, there is a sneaky category of information that can still leak: metadata. That is the data about your data.
What metadata reveals
File names, file sizes, timestamps, how often you upload, when you upload, how big each upload is. Even if nobody can read the actual contents, this information tells a story.
Think of a sealed envelope. Nobody can read the letter inside, but the envelope itself shows who sent it, who received it, when, how heavy it is, and where it was mailed from. Watch the envelopes long enough and you can map someone's entire network without ever opening one.
Watch out for this
If your encrypted files are stored with readable names like "client-merger-contract.enc," the encryption did not help much. A 4 KB file is probably a config. A 500 MB file is probably a database. Timestamps show when you work and when you ship. All of this leaks without anyone reading a single file.
Good encryption tools address this with random-looking file names, size padding, and batched uploads. No solution is perfect, but "is the metadata also protected?" is a question worth asking.
Client-side encryption solves payload confidentiality. It does not automatically solve metadata leakage, and metadata can be operationally equivalent to plaintext in many threat models.
File names reveal project identifiers and client names. File sizes distinguish configs from database dumps. Timestamps expose work patterns, deployment cadence, and incident response timelines. Access frequency reveals priority and sensitivity. An adversary with metadata access can reconstruct organizational structure without decrypting a single byte.
AI infrastructure parallel: if your vector store uses human-readable collection names ("customer-support-embeddings," "internal-policy-rag"), the collection names leak system architecture even if every vector is encrypted. Opaque identifiers, size padding, and batched operations are mitigation layers. None are complete. Metadata protection is a spectrum, and "is metadata encrypted?" is a separate question from "is data encrypted?"
The one question to ask every vendor
You do not need to understand cryptographic internals to evaluate encryption claims. You need one question:
"Who holds the key?"
If the vendor holds the key, it is their encryption. It protects you from outsiders but not from the vendor. It is Level 2.
If the vendor sees the key during operations, it is shared encryption. It is better than Level 2, but the vendor can still access your data when they need to. It is Level 3.
If the key never leaves your machine, it is your encryption. The vendor is structurally locked out. Even they cannot help you if you lose the key. That is Level 4.
The follow-up questions are straightforward:
- If I lose my key, can you recover my data? (If yes: they have the key. Level 2.)
- Can you read my data for support purposes? (If yes: they have the key. Level 2.)
- If you receive a subpoena, can you hand over the plaintext? (If yes: they have the key. Level 2.)
- If your entire infrastructure is compromised, is my data safe? (If no: the key is on their infrastructure somewhere.)
These questions are not adversarial. They are clarifying. Most vendors will answer honestly because the architecture either supports the claim or it does not. Encryption properties are not ambiguous. They are mathematical. The key is either on the vendor's system or it is not.
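The questions above amount to a small decision procedure. Here is one way to write it down, as a sketch only: the field names and the `classify` function are hypothetical, and the mapping simply follows the levels as this post defines them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VendorAnswers:
    """A vendor's answers to the follow-up questions (hypothetical field names)."""
    can_recover_lost_key: bool
    can_read_for_support: bool
    can_produce_plaintext_for_subpoena: bool
    key_passes_through_vendor: bool
    encrypted_at_rest: bool = True


def classify(a: VendorAnswers) -> int:
    """Return the encryption level (1 to 4) implied by the vendor's answers."""
    if not a.encrypted_at_rest:
        return 1  # TLS only: the server stores plaintext
    if (a.can_recover_lost_key
            or a.can_read_for_support
            or a.can_produce_plaintext_for_subpoena):
        return 2  # the vendor holds the key
    if a.key_passes_through_vendor:
        return 3  # customer-managed key, but used on vendor systems
    return 4      # the key never leaves the client


typical_saas = VendorAnswers(True, True, True, True)
client_side = VendorAnswers(False, False, False, False)
print(classify(typical_saas), classify(client_side))
```

A single "yes" to any of the first three questions is enough to land at Level 2, which matches the rule of thumb that recovery, support access, and compelled disclosure all require the vendor to hold the key.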
The word "encrypted" on a product page is the starting point of the conversation, not the end of it. Now you know what to ask next.
You do not need to understand any of the technical details to evaluate an encryption claim. You need one question:
"Who holds the key?"
That is it. One question. The answer tells you everything.
Follow-up questions that reveal the truth
"If I lose my key, can you recover my files?" If yes, they have the key.
"Can you read my files for support?" If yes, they have the key.
"If you get a court order, can you hand over my stuff?" If yes, they have the key.
"If your entire system gets hacked, are my files safe?" If no, the key is on their system somewhere.
These are not aggressive questions. They are clarifying. The architecture either supports the claim or it does not. "Encrypted" on a product page is the start of the conversation, not the end.
One question collapses the entire evaluation framework into a single diagnostic:
"Who holds the key?"
Vendor holds key = Level 2. Vendor sees key during operations = Level 3. Key never leaves client = Level 4. No further analysis required to classify the architecture.
Validation questions that confirm the classification:
Key recovery: "Can you recover my data if I lose the key?" Yes = vendor has key material.
Support access: "Can you read my data for debugging?" Yes = vendor has plaintext access.
Compelled disclosure: "Can you comply with a subpoena for plaintext?" Yes = vendor holds decryption capability.
Infrastructure compromise: "If your entire stack is breached, is my data safe?" No = key co-located with ciphertext.
Apply this to your own stack. Your LLM provider, your vector store, your agent orchestration platform: ask each one who holds the key. The answer determines whether "encrypted" is an architectural property or a marketing signal.
Frequently asked questions
What are the four levels of encryption for cloud storage?
Level 1 is encrypted in transit only (TLS), which protects data between your machine and the server but leaves it readable on the server. Level 2 is encrypted at rest with vendor-managed keys, where the provider encrypts data on disk but holds the keys and can decrypt it. Level 3 is encrypted at rest with customer-managed keys (BYOK), where you supply and can revoke the key but the vendor still uses it on their servers and sees your plaintext during operations. Level 4 is true client-side encryption with user-held keys, where the key never leaves your machine and the vendor mathematically cannot read your data.
Why does a private GitHub repository not count as encrypted?
A private GitHub repository restricts access through authentication and authorization, but the data itself is not encrypted with a key you control. GitHub's infrastructure processes, employees with sufficient internal access, and anyone who compromises GitHub's systems can read the plaintext of your code. Private means access-controlled. Encrypted means mathematically unreadable without the key. These are different properties, and only encryption with a key you hold protects against insider threats, breaches, and compelled disclosure.
What question should I ask every vendor who claims their product is encrypted?
Ask: who holds the encryption key? If the vendor generates it, stores it, or can recover it, then they can read your data regardless of what cipher they use. The only encryption that protects you from the vendor itself is encryption where the key exists solely on your hardware. Follow up by asking: if you are compelled by a court order, can you hand over my data in readable form? If the answer is yes, the key is on their infrastructure.
What are the four levels of encryption?
Level 1: scrambled while traveling to the server, readable once it arrives. Level 2: scrambled on the server's disk, but the company holds the key and can read it anytime. Level 3: you bring your own key, but the company still sees it during use. Level 4: scrambled on your computer before it goes anywhere. The company never sees the key and cannot read your files.
Does setting my project to "private" mean it is encrypted?
No. "Private" means the public cannot see your project, but the platform (GitHub, GitLab) can still read every file. They need to for features like search and security scanning. Private is a locked door. Encryption makes the contents unreadable even to someone standing inside the room. Only encryption with a key you hold protects against the platform itself, employee access, or a data breach.
What is the one question I should ask any tool that says "encrypted"?
"Who holds the key?" If the company holds it, they can read your files. If the company can recover your key when you lose it, they have a copy. If a court order can make them hand over readable files, the key is on their system. The only encryption that fully protects you is when the key never leaves your computer.
How do the four encryption levels map to AI infrastructure?
Level 1 (TLS) covers API calls to LLM providers: data is protected in transit but readable at the endpoint. Level 2 (vendor-managed keys) is where most managed vector stores, model hosting, and agent platforms operate: data is encrypted on disk with vendor keys, and the vendor retains read access. Level 3 (BYOK) is offered by some enterprise cloud providers for model storage, but the key transits their infrastructure. Level 4 (client-side) means encrypting embeddings, configs, and agent state before they reach any vendor. Most AI infrastructure operates at Level 2.
Why is a "private" repository insufficient for AI agent configurations?
"Private" is an authorization boundary, not a cryptographic one. The platform retains plaintext read access. For agent system prompts, MCP server configs, and RAG pipeline schemas that contain proprietary reasoning chains or competitive differentiators, private visibility controls do not protect against platform-level access, infrastructure breaches, or compelled disclosure. Only client-side encryption with user-held keys provides that protection.
What single question classifies a vendor's encryption architecture?
"Who holds the key?" This single diagnostic classifies the architecture. Vendor holds key = Level 2. Vendor sees key during operations = Level 3. Key never leaves client = Level 4. Validate with: "Can you comply with a subpoena for plaintext?" If yes, the vendor holds decryption capability regardless of what cipher they advertise. Apply this to every vendor in your AI stack: LLM provider, vector store, orchestration platform, monitoring service.