Cryptography is the science and study of taking data and making it hidden so others cannot understand it.
Cryptography provides confidentiality by using obfuscation. There are a lot of ways to use obfuscation to provide confidentiality.
- Encryption: the process of obfuscating data
- Decryption: the process of unobfuscating data.
- Caesar cipher: a common cipher that shifts letters in the alphabet by some number. ROT1 would replace instances of the letter A with the letter B. ROT2 would replace instances of the letter A with the letter C. Very weak form of encryption.
- Cryptanalysis: the process of breaking down an encrypted message to understand how to decrypt it.
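- Algorithm: the procedure used to encrypt and decrypt data. Algorithms should be known to everyone.
- Key: a secret value that, combined with the algorithm, determines the output.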
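The Caesar cipher from the list above can be sketched in a few lines of Python. This is a minimal illustration, assuming plain ASCII letters; the function name `caesar` and the sample message are made up for the example.

```python
import string


def caesar(text: str, shift: int) -> str:
    """Shift each ASCII letter by `shift` positions, wrapping around the alphabet."""
    result = []
    for ch in text:
        if ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        elif ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        else:
            # Spaces, digits, and punctuation pass through unchanged.
            result.append(ch)
    return "".join(result)


ciphertext = caesar("ATTACK AT DAWN", 3)   # encrypt with a shift of 3
plaintext = caesar(ciphertext, -3)         # decrypt by shifting back
print(ciphertext)
print(plaintext)
```

Because there are only 25 useful shifts, an attacker can simply try all of them, which is why this is such a weak form of encryption.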
Exclusive Or (XOR)
Ciphers such as the Caesar cipher work well with alphabetic plaintext, but they don’t work well with binary data. One operation that does work well with binary data is XOR. This logic is fundamental to many (most?) common encryption algorithms.
XOR is a logic gate function that outputs a 1 when an odd number of its inputs are 1 (with two inputs, only when exactly one input is 1).
Exclusive OR truth table
|Input A||Input B||Output|
|0||0||0|
|0||1||1|
|1||0||1|
|1||1||0|
My name, steven, encoded in lowercase ASCII and converted to binary.
- s - 01110011
- t - 01110100
- e - 01100101
- v - 01110110
- e - 01100101
- n - 01101110
Simply XOR the Key with the Plain text as inputs
Imagine the key is 11010
|Key (repeated)||11010110 10110101 10101101 01101011 01011010 11010110|
|Plain Text||01110011 01110100 01100101 01110110 01100101 01101110|
|Cipher Text||10100101 11000001 11001000 00011101 00111111 10111000|
Cipher text is the output; it’s the encrypted version of the data. To decrypt the data and turn ciphertext back into plain text, simply XOR the ciphertext with the key again. Because XOR uses the same secret key in both directions, you need the exact key to decrypt. If you don’t have the exact key, the result of decrypting will be different from the original plain text.
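The XOR round trip can be tried out with a short Python sketch. It assumes the lowercase string "steven" as plain text and the 5-bit key 11010 repeated across the message; the helper names (`to_bits`, `xor_bits`, `from_bits`) are invented for this example.

```python
from itertools import cycle


def to_bits(text: str) -> str:
    """Encode ASCII text as a string of 0s and 1s, 8 bits per character."""
    return "".join(f"{byte:08b}" for byte in text.encode("ascii"))


def xor_bits(bits: str, key: str) -> str:
    """XOR a bit string with a key that repeats for as long as needed."""
    return "".join(str(int(b) ^ int(k)) for b, k in zip(bits, cycle(key)))


def from_bits(bits: str) -> str:
    """Decode a string of 0s and 1s back into ASCII text."""
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)).decode("ascii")


key = "11010"
plain_bits = to_bits("steven")
cipher_bits = xor_bits(plain_bits, key)        # encrypt
recovered_bits = xor_bits(cipher_bits, key)    # XOR with the same key decrypts
wrong_bits = xor_bits(cipher_bits, "11011")    # a slightly different key

print(from_bits(recovered_bits))    # steven
print(wrong_bits == plain_bits)     # False
```

A short repeating key like this is only for illustration; real ciphers use XOR as one building block among several.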
This illustrates a principle commonly known as Kerckhoffs’s principle.
A cryptosystem should stay secure even when everything about it except the key is public. Someone who understands the algorithm completely still cannot make sense of the data without the key. Most encryption algorithms are open, which lets us share and research them under intense scrutiny without compromising security.
Types of Data to be encrypted
- Data at rest: an example of data at rest is data stored on a hard drive
- Data in transit: an example of data in transit is transmitting data over a network
- Data in process: data in process is data which is actively being worked on and manipulated by an application
Types of encryption
- Symmetric encryption: the same key is used for encryption and decryption. It’s easy to send the encrypted information from one party to another, but not easy to share the key. In-band key transfer is when the key is sent with the encrypted data. Under Kerckhoffs’s principle the algorithm is assumed to be public, so anyone who intercepts both the key and the data can decrypt it, which makes in-band transfer unsafe. Out-of-band key transfer is when the key is not sent with the encrypted data. Symmetric encryption is the primary way we encrypt data.
- Asymmetric encryption: uses a key pair (public/private keys). Alice generates two keys; the private key never leaves Alice, and the public key may be given to anyone. For encryption, the public key is only used to encrypt and the private key is only used to decrypt. Bob can encrypt a message with Alice’s public key, and then only Alice can decrypt it. To provide two-way communication with asymmetric encryption, Bob must generate his own key pair and send his public key to Alice. Asymmetric encryption works well and is secure, but there is a lot of overhead, and it’s much slower than symmetric encryption. We typically use asymmetric encryption to exchange a session/ephemeral key for symmetric encryption, then switch to that.
- Ephemeral Key: A key that is used temporarily and not over long periods of time. Ephemeral keys are what make perfect forward secrecy possible.
- Cryptosystem: specifies a process that programs can use to make cryptography work for them, specifies things like key size, message exchanges, etc.
- Block: chunk of data, of a specific size in bits
Important Properties of Crypto Systems
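- Number of rounds: how many times the cipher repeats its internal transformation on each block
- Block Size: how many bits the cipher processes at a time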
Examples of cryptosystems
Computing started around the 1940s. The idea of encrypting computer data didn’t make sense until around 1970.
You need to know the block size, number of rounds, and the key size of each of these symmetric block ciphers for the exam.
|Block Cipher Name||Block Size||Number of Rounds||Key Size|
|DES||64-bit||16 rounds||56 bits|
|3DES||64-bit||16 rounds per DES pass (48 total)||56 bits * 3 = 168 bits|
|Blowfish||64-bit||16 rounds||32-448 bits|
|AES||128-bit||10, 12, or 14 depending on key size||128, 192, or 256 bits|
DES was no longer considered secure, Blowfish wasn’t under government control, and 3DES was only a patch job. In the late 1990s, NIST started a competition to narrow down candidates for a replacement standard.
An algorithm called Rijndael (pronounced roughly “rain doll”) was the winner and became what we now know as the Advanced Encryption Standard (AES). Since then, it’s been the de facto standard for modern-day symmetric encryption (this was written in the year 2020). AES is a US government encryption standard, published by NIST.
A streaming cipher works differently. Instead of taking blocks, it encrypts data a single bit at a time as each bit comes through the stream. For the exam, there is only one common streaming cipher in use today (exam knowledge, but certainly untrue in practice):
- RC4 (Rivest Cipher 4): streaming cipher, 1-bit block size, 1 round, key size 40-2048 bits
Symmetric Block Mode Encryption
Block encryption has a problem. If the same key encrypts two identical plaintext blocks independently, the ciphertext blocks are identical too. This mode is called ECB, or electronic codebook: the same input block always produces the same output block. It works poorly for data with repeating patterns, such as images.
For proper diffusion, flipping 1 bit anywhere in the plaintext should result in a completely different output for the whole file. With ECB, diffusion only happens at the block level, so a lot of patterns within an image remain visible. Take an ECB-encrypted image of Tux, the Linux penguin: the encryption pulls in one block at a time, and for a while every block is all-white background, so all of those blocks encrypt to the same output. The same goes for the black body of Tux.
We don’t use ECB mode; we use other block modes to prevent this. These block modes get around the limitation described above and produce output that looks random, even when one plaintext block is exactly the same as another.
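- Cipher Block Chaining (CBC): The first plaintext block is XOR’d with an initialization vector (IV), then encrypted. Each ciphertext block is then XOR’d with the next plaintext block before that block is encrypted, so it plays the role the IV played for the first block.
- Cipher Feedback (CFB): Encrypt the initialization vector, then XOR the output with the first plaintext block to get the first ciphertext block. That ciphertext block is then encrypted and XOR’d with the next plaintext block, and so on.
- Output Feedback (OFB): Encrypt the initialization vector, then XOR the output with the first plaintext block. For the next block, encrypt the previous encryption output again (not the ciphertext) and XOR that with the plaintext, so the keystream depends only on the key and the initialization vector.
- Counter (CTR): A nonce is combined with a counter value, and the combination is encrypted. XOR the result with the first plaintext block, then increment the counter and repeat for subsequent blocks.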
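The difference between ECB and CBC can be seen with a small Python sketch. Everything here is an assumption made for illustration: `toy_encrypt_block` is a deliberately trivial stand-in for a real block cipher such as AES (it is not secure), the block size is 8 bytes, and the plaintext length must be a multiple of the block size because no padding is implemented.

```python
BLOCK_SIZE = 8  # bytes; toy value chosen for this sketch


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(block: bytes, key: bytes) -> bytes:
    """Stand-in for a real block cipher: XOR with the key, rotate one byte. NOT secure."""
    mixed = xor_bytes(block, key)
    return mixed[1:] + mixed[:1]


def toy_decrypt_block(block: bytes, key: bytes) -> bytes:
    """Inverse of toy_encrypt_block: undo the rotation, then undo the XOR."""
    unrotated = block[-1:] + block[:-1]
    return xor_bytes(unrotated, key)


def split_blocks(data: bytes) -> list:
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def ecb_encrypt(plaintext: bytes, key: bytes) -> bytes:
    # Every block is encrypted on its own, so equal blocks give equal output.
    return b"".join(toy_encrypt_block(b, key) for b in split_blocks(plaintext))


def cbc_encrypt(plaintext: bytes, key: bytes, iv: bytes) -> bytes:
    out, previous = [], iv
    for block in split_blocks(plaintext):
        # XOR with the previous ciphertext block (the IV for the first block).
        previous = toy_encrypt_block(xor_bytes(block, previous), key)
        out.append(previous)
    return b"".join(out)


def cbc_decrypt(ciphertext: bytes, key: bytes, iv: bytes) -> bytes:
    out, previous = [], iv
    for block in split_blocks(ciphertext):
        out.append(xor_bytes(toy_decrypt_block(block, key), previous))
        previous = block
    return b"".join(out)


key = b"k3y-k3y!"
iv = bytes(range(BLOCK_SIZE))
plaintext = b"AAAAAAAA" * 3  # three identical blocks, like a plain white background

ecb_blocks = split_blocks(ecb_encrypt(plaintext, key))
cbc_ciphertext = cbc_encrypt(plaintext, key, iv)
cbc_blocks = split_blocks(cbc_ciphertext)

print(len(set(ecb_blocks)))   # 1: the repeated pattern is still visible
print(len(set(cbc_blocks)))   # 3: chaining hides the repetition
```

In real code the block cipher and the mode would come from a vetted cryptography library rather than being written by hand.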
Asymmetric encryption allows us to safely transmit a public key in the clear to another party. When data is encrypted with the public key, only the private key can decrypt it. The private key should never be transmitted if at all possible.
- RSA (Rivest Shamir Adleman): one of the first public-key cryptosystems and widely used for secure data transmission. One problem with RSA is that key sizes end up getting very large (for example, 4096 bits).
- ECC (Elliptic Curve Cryptography): one of the newer asymmetric algorithms. Allows very small keys with the same robustness as RSA keys. A 3072-bit RSA key can be replaced with a 256-bit ECC key. It’s only now starting to become widely popular (2020).
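The public/private key relationship can be illustrated with textbook RSA on tiny numbers. This is only a sketch: the primes 61 and 53 and the exponent 17 are a common teaching example chosen here as assumptions, real keys are thousands of bits long, and real RSA also adds padding that is left out.

```python
# Key generation with toy primes (far too small for real use).
p, q = 61, 53
n = p * q                    # 3233, part of both the public and private key
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent
d = 2753                     # private exponent, chosen so (e * d) % phi == 1

public_key = (n, e)
private_key = (n, d)

message = 65                              # a number smaller than n
ciphertext = pow(message, e, n)           # anyone can encrypt with the public key
recovered = pow(ciphertext, d, n)         # only the private key holder can decrypt

print(ciphertext)   # 2790
print(recovered)    # 65
```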
Diffie-Hellman uses asymmetric techniques to let two parties agree on a session key, so the communication channel can transition to symmetric encryption. It is a process that allows two parties to securely agree upon an ephemeral key; it is not an encryption algorithm itself. This is called a key exchange protocol.
Diffie-Hellman Groups help define standards for the size or type of key structure to use.
|Group Name||Group Value|
|Group 1||768-bit modulus|
|Group 2||1024-bit modulus|
|Group 5||1536-bit modulus|
|Group 14||2048-bit modulus|
|Group 19||256-bit elliptic curve|
|Group 20||384-bit elliptic curve|
|Group 21||521-bit elliptic curve|
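The key agreement itself can be sketched with very small numbers. The modulus 23, generator 5, and the two private values below are toy assumptions for readability; the groups in the table above use moduli of 768 bits and up, and real private values are chosen at random.

```python
# Public parameters, known to everyone (toy values).
modulus = 23
generator = 5

# Each party picks a private value and never sends it.
alice_private = 4
bob_private = 3

# Each party sends only the result of generator^private mod modulus.
alice_public = pow(generator, alice_private, modulus)   # 4
bob_public = pow(generator, bob_private, modulus)       # 10

# Each party combines its own private value with the other's public value.
alice_shared = pow(bob_public, alice_private, modulus)
bob_shared = pow(alice_public, bob_private, modulus)

print(alice_shared, bob_shared)   # 18 18
```

Both sides end up with the same value without it ever crossing the network, and that value can then seed the symmetric session key.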
PGP / GPG
Pretty Good Privacy (PGP) has been around for over 25 years. It was originally invented for email encryption, but it can also be used to encrypt files, sign files, or even do full disk encryption.
PGP generates a random key, then encrypts the data using that random key. It then encrypts the random key using the receiver’s public key. The encrypted key and the encrypted data are sent to the receiver together.
PGP decryption separates the encrypted key from the data, then decrypts the message key using the receiver’s private key. The message key can then be used to decrypt the data into plaintext.
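The hybrid flow described above can be sketched as follows. This is not how PGP is actually implemented: the toy RSA numbers (3233, 17, 2753), the SHA-256 based keystream standing in for a real symmetric cipher, and the function names are all assumptions made to keep the example self-contained.

```python
import hashlib
import secrets

# Toy RSA key pair (textbook numbers, not secure).
PUBLIC_KEY = (3233, 17)
PRIVATE_KEY = (3233, 2753)


def keystream_xor(data: bytes, key: bytes) -> bytes:
    """Stand-in symmetric cipher: XOR data with a keystream derived from the key."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(c ^ p for c, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def pgp_style_encrypt(message: bytes, public_key):
    n, e = public_key
    session_key = secrets.token_bytes(16)                    # random one-time key
    encrypted_data = keystream_xor(message, session_key)     # symmetric step
    wrapped_key = [pow(byte, e, n) for byte in session_key]  # asymmetric step
    return wrapped_key, encrypted_data


def pgp_style_decrypt(wrapped_key, encrypted_data: bytes, private_key) -> bytes:
    n, d = private_key
    session_key = bytes(pow(c, d, n) for c in wrapped_key)   # unwrap the key
    return keystream_xor(encrypted_data, session_key)        # decrypt the data


wrapped, data = pgp_style_encrypt(b"meet me at noon", PUBLIC_KEY)
print(pgp_style_decrypt(wrapped, data, PRIVATE_KEY))
```

The slow asymmetric operation only ever touches the short session key, while the bulk data goes through the fast symmetric step.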
Web of trust - similar to PKI in a way but instead it’s decentralized.
- Symantec Corporation (proprietary, not free): mass storage encryption, signing, disk encryption, BitLocker, FileVault, enterprise cloud solutions
- OpenPGP is free, mostly does encrypted email, PKI support, S/MIME
- GPG (GNU Privacy Guard) is a free toolset based on OpenPGP. GPG will do file and disk encryption as well
Remember CIA: cryptography isn’t only about confidentiality, but also integrity.
A hash is an algorithm that takes an arbitrary amount of data and outputs a fixed-size value derived from the input. Hashes are one-way functions; unlike encryption, there is no way to reverse them. Hashes are also deterministic: the same input with the same hash function will always return the same output. Slight changes in the input will result in wildly different outputs.
- Message Digest 5 (MD5): Published in 1992 by Ron Rivest, produces a 128-bit hash.
- Secure Hash Algorithm (SHA): Published by the National Institute of Standards and Technology (NIST). SHA-1 has a 160-bit hash. SHA-2 digests can be 224, 256, 384, or 512 bits, with SHA-512/224 and SHA-512/256 as truncated variants.
- RIPEMD: Not common, open standard, nothing wrong but just not common. Comes in 128, 160, 256, and 320-bit digests.
Both MD5 and SHA1 have known collisions.
- Collision: when two distinct inputs run through a hash function create the same output
- Digest: the output of a hashing function
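These hash properties are easy to check with Python’s standard `hashlib` module. The inputs below are arbitrary sample strings chosen for the example.

```python
import hashlib

# Deterministic: the same input always gives the same digest.
first = hashlib.sha256(b"hello world").hexdigest()
second = hashlib.sha256(b"hello world").hexdigest()

# A one-character change produces a completely different digest.
changed = hashlib.sha256(b"hello worle").hexdigest()

# Fixed size: the digest length does not depend on the input length.
small = hashlib.sha256(b"a").digest()
large = hashlib.sha256(b"a" * 1_000_000).digest()

# Digest sizes in bits for a few of the algorithms listed above.
sizes = {
    "md5": hashlib.md5(b"").digest_size * 8,
    "sha1": hashlib.sha1(b"").digest_size * 8,
    "sha256": hashlib.sha256(b"").digest_size * 8,
    "sha512": hashlib.sha512(b"").digest_size * 8,
}

print(first == second)          # True
print(first == changed)         # False
print(len(small), len(large))   # 32 32
print(sizes)
```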
An example use of hashing is to store passwords. Typically we do not store the password itself in plaintext but will store a hash of the password.
HMAC stands for Hash-based Message Authentication Code. It provides authentication: the receiver can determine that a packet came from a specific station and was not altered. The sender combines the packet with a shared secret key and hashes the result; the receiver repeats the calculation and compares. This requires both sender and receiver to have the same key.
Steganography is the process of taking data and hiding it in other data. For example, embedding data inside an image file.
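A minimal HMAC sketch using Python’s standard `hmac` module is shown below. The shared key, the packet contents, and the helper names `make_tag` and `check_tag` are placeholders invented for the example.

```python
import hashlib
import hmac


def make_tag(key: bytes, packet: bytes) -> bytes:
    """Sender side: compute the HMAC of the packet with the shared key."""
    return hmac.new(key, packet, hashlib.sha256).digest()


def check_tag(key: bytes, packet: bytes, tag: bytes) -> bool:
    """Receiver side: recompute the HMAC and compare in constant time."""
    return hmac.compare_digest(make_tag(key, packet), tag)


shared_key = b"example shared secret"
packet = b"transfer 100 to account 42"
tag = make_tag(shared_key, packet)

print(check_tag(shared_key, packet, tag))                           # True
print(check_tag(shared_key, b"transfer 900 to account 42", tag))    # False
print(check_tag(b"some other key", packet, tag))                    # False
```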
Certificates and Trust
Start by generating a public and private key pair. The problem with asymmetric encryption is that you don’t know whether a public key really comes from the party it claims to come from. Symmetric encryption doesn’t have this problem because authentication is assumed to occur at the time the key is agreed upon.
A web server can prove it holds the private key matching its public key. It hashes some data, encrypts the hash with the private key, and sends the data, the encrypted hash, and the public key to the receiver. The receiver decrypts the hash with the public key, hashes the data itself, and compares the two hashes; if they match, the server holds the matching private key.
- Digital Signature: A hash of a chunk of data, encrypted with the sender’s private key, which proves the sender holds the private key matching the public key sent in the message.
This digital signature process can be repeated to sign another party’s key, which allows us to validate the identity based on parties we trust. All of this can be bundled into something called a digital certificate.
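The sign-and-verify steps can be sketched with the same kind of toy RSA numbers used for teaching. The key values (3233, 17, 2753) and the function names are assumptions for illustration, and the hash is reduced modulo the tiny key so it fits; real signatures use large keys and a standardized padding scheme.

```python
import hashlib

# Toy RSA key pair (not secure).
N, E, D = 3233, 17, 2753


def digest_as_int(message: bytes) -> int:
    """Hash the message and shrink the result so it fits the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sender: hash the data, then transform the hash with the private key."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Receiver: undo the signature with the public key and compare hashes."""
    return pow(signature, E, N) == digest_as_int(message)


message = b"www.example.com public key"
signature = sign(message)
forged = (signature + 1) % N    # any altered signature fails to verify

print(verify(message, signature))   # True
print(verify(message, forged))      # False
```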
Trusted third parties are referred to as Certificate Authorities.
Public Key Infrastructure (PKI) is a hierarchical method, and it’s the most widely used implementation of digital trust in the world. PKI is about more than trust. It’s about distribution, control, maintenance, revocation, and more.
At the top is the Certificate Authority (CA). Intermediate certificate authorities are a way to distribute the load of signing certificates to multiple other servers. The top-level or root CA should be kept offline as per best practice. This line of trust is called the certification path.
PKI is not a standard, it is a concept based on X.509.
PKCS (Public Key Cryptography Standards) is the way RSA does things and has become the de facto standard. It is a group of standards. PKCS#7 is used to store a certificate as a file and uses the P7B file extension. PKCS#12 is used to store private keys together with their public key certificates and uses the PFX file extension.
- CRL: Certificate Revocation List. The certificate provides a path for the client to find the revocation list. The list contains the serial numbers of revoked certificates along with the reason each was revoked. CRLs may take up to 24 hours to reflect a revocation.
- OCSP: Online Certificate Status Protocol addresses the shortfalls of CRLs because it can check in near real-time whether a certificate is revoked or not.