Private...but how much private?

April 22, 2026
cpp / cplusplus / systemsprogramming / softwareengineering / computerscience / memorymodel / lowlevel / programming / coding

Private Variables Are a Suggestion

The JavaScript Girl and My Private Variable

(Note: The JavaScript girl is mentioned in earlier articles and will keep making appearances in future articles too!)

In college, the JavaScript girl was my crush. Her best friend was my best friend too. Naturally, I confided in her best friend, treating my feelings like a perfectly encapsulated private variable. I thought my secret was safe.

I was wrong. Just like in C++, I learned the hard way that what you think is private is often just publicly accessible through the back door.

Let's explore why privacy in C++ is just an illusion.

The Illusion of Encapsulation

My favorite professor, Sanjeet Sir, teaching the private access specifier

In Object-Oriented Programming, we are frequently told that encapsulation protects our data and that private variables are inherently safe. It sounds comforting, making it feel like the language is actively guarding your object.

While that might hold true in managed languages like Java or C#, C++ operates on an entirely different philosophy: it will not stop you from causing damage. If you break it, the responsibility falls entirely on you.

The harsh reality? In C++, private is not a runtime security guard. It is merely a compile-time suggestion. It means you aren't allowed to access the variable politely through the source code. It absolutely does not prevent the underlying memory from being violently taken over.

Once the program is running, memory is just memory. The compiler complains, but the CPU? The CPU doesn't care about your feelings.

Breaking the Abstraction: A Systems Programmer's Playground

Consider this simple class:

#include <iostream>
using namespace std;

class Secret {
private:
    int x = 10;

public:
    void print() {
        cout << "x = " << x << endl;
    }
};

By all standard rules, writing obj.x = 100; outside the class is a compile error. You feel safe. But let's see what happens when a systems programmer decides the compiler is just offering a friendly piece of advice:

int main() {
    Secret obj;
    obj.print(); // Output: x = 10

    // Breaking encapsulation
    int* ptr = (int*)&obj;
    *ptr = 999;

    obj.print(); // Output: x = 999
}

What just happened? An object of this class occupies one contiguous block of memory, and because Secret has a single int member and nothing else, that block is simply the bytes of x. When we cast the address of the object to an int*, we are explicitly telling the compiler: "Treat this address as the address of an integer."

There are no checks. No runtime exceptions. We write 999 directly over the memory where x resides. Encapsulation vanishes instantly.

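(Technical Note: This one-member case is better behaved than it looks. Secret is a standard-layout class, so the language guarantees that a pointer to the object can be converted to a pointer to its first data member. Reaching past that first member, or trying the same trick on a class that is not standard-layout (virtual functions, mixed access levels), breaks the strict aliasing and pointer arithmetic rules and is Undefined Behavior. Code that genuinely needs to copy an object's bytes should use std::memcpy or C++20's std::bit_cast on trivially copyable types. Either way, private does nothing to stop it.)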
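As a minimal sketch of the std::memcpy route mentioned in the note, the snippet below reads and overwrites the private member by copying bytes instead of dereferencing a cast pointer. The helper names peek and poke and the get() accessor are hypothetical additions for illustration; the static_asserts spell out the assumptions the trick relies on.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

class Secret {
    int x = 10;                       // private by default
public:
    int get() const { return x; }     // accessor added only so the result can be checked
};

// Assumptions this sketch depends on; compilation fails loudly if they do not hold.
static_assert(std::is_trivially_copyable<Secret>::value, "memcpy needs a trivially copyable type");
static_assert(std::is_standard_layout<Secret>::value, "first member must sit at offset 0");
static_assert(sizeof(Secret) == sizeof(int), "object must be exactly one int");

// Read the private value by copying the object's bytes out.
int peek(const Secret& s) {
    int out = 0;
    std::memcpy(&out, &s, sizeof out);
    return out;
}

// Overwrite the private value by copying bytes in.
void poke(Secret& s, int value) {
    std::memcpy(&s, &value, sizeof value);
}
```

Calling poke(obj, 999) and then obj.get() should show the new value, with no cast-and-dereference involved and no help from the class itself.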

Walking Memory Without Permission

The previous example works flawlessly because there is a single data member, no memory padding, and a highly predictable layout. But the same principles apply when dealing with more complex scenarios. You can literally walk through object memory like an array:

Multiple Variables via Offset Guessing

class Secret {
private:
    int a = 1;
    int b = 2;
    int c = 3;

public:
    void print() {
        cout << a << " " << b << " " << c << endl;
    }
};

int main() {
    Secret obj;
    int* ptr = (int*)&obj;

    // Modifying 'b' by moving one integer offset forward
    ptr[1] = 999; 

    obj.print(); // Output: 1 999 3
}

Padding and Alignment Complications

Memory layouts aren't always straightforward. When you mix data types, compilers inject padding to align memory for hardware efficiency.

class Secret {
private:
    char a = 'A';
    int x = 10;
};

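In this case, a char takes 1 byte, and on typical platforms (a 4-byte int with 4-byte alignment) the compiler adds 3 bytes of padding before the int. To modify x, your pointer logic must account for this 4-byte offset, and the arithmetic has to happen on a byte pointer, because adding 4 to an int* would skip 16 bytes instead: char* base = (char*)&obj; int* x_ptr = (int*)(base + 4);.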
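Rather than hard-coding the 4, the offset can be measured. The sketch below assumes a hypothetical public twin, SecretMirror, with the same members in the same order, and asks offsetof for the position of x in it; the get() accessor and overwrite_x helper are likewise illustrative names, not part of the original class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class Secret {
    char a = 'A';
    int x = 10;
public:
    int get() const { return x; }     // accessor added only so the result can be checked
};

// Hypothetical public twin: same member types in the same order.
struct SecretMirror {
    char a;
    int x;
};

static_assert(sizeof(Secret) == sizeof(SecretMirror), "mirror must match the real size");

// Overwrite the private x using a byte offset measured on the mirror.
void overwrite_x(Secret& s, int value) {
    const std::size_t off = offsetof(SecretMirror, x);   // padding is already included
    unsigned char* base = reinterpret_cast<unsigned char*>(&s);
    std::memcpy(base + off, &value, sizeof value);
}
```

Because the offset comes from the compiler, the same code keeps working on a platform where int has a different size or alignment.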

Using reinterpret_cast or memcpy

If raw C-style casts feel messy, modern C++ offers more explicit variants, or you can leverage raw block copying. Both achieve the exact same dangerous outcome:

// Method A: Modern C++ cast
int* ptr = reinterpret_cast<int*>(&obj);
ptr[0] = 500;

// Method B: Raw memory block copying (needs #include <cstring>)
int val = 777;
std::memcpy(&obj, &val, sizeof(int));

The Reality Check

Relying on offset guessing is formally considered Undefined Behavior (UB). It is compiler-dependent, heavily reliant on the host architecture, and absolutely not meant for production code. However, utilizing these techniques in isolated experiments reveals the unvarnished reality of how C++ functions internally.

When Scope Increases: From Guessing to Calculation

It's easy to manually count bytes for a class with three variables. But what happens when your codebase spans millions of lines, abstractions are nested recursively, multiple inheritance is involved, and virtual tables enter the chat?

You can no longer just "guess" offsets. You have to transition from estimating to calculating. C++ memory layouts are not randomized; they strictly obey Application Binary Interfaces (like the Itanium C++ ABI). Because they follow rigid, deterministic rules, memory can always be mapped—and conquered.
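For plain classes without inheritance or virtual functions, the rule mainstream ABIs apply is simple: each member goes at the next offset that is a multiple of its alignment, and the total size is rounded up to the alignment of the whole type. Here is a rough sketch of that arithmetic, checked against the compiler's own answer. Probe and align_up are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Round `offset` up to the next multiple of `align`.
constexpr std::size_t align_up(std::size_t offset, std::size_t align) {
    return (offset + align - 1) / align * align;
}

// Hypothetical example type with mixed member sizes.
struct Probe {
    char tag;
    double value;
    int count;
};

// Predicted offsets, walking the members in declaration order.
constexpr std::size_t off_tag   = 0;
constexpr std::size_t off_value = align_up(off_tag + sizeof(char), alignof(double));
constexpr std::size_t off_count = align_up(off_value + sizeof(double), alignof(int));
constexpr std::size_t total     = align_up(off_count + sizeof(int), alignof(Probe));

// Compare the hand calculation with what the compiler actually chose.
bool prediction_matches() {
    return off_value == offsetof(Probe, value)
        && off_count == offsetof(Probe, count)
        && total == sizeof(Probe);
}
```

Inheritance, virtual tables, and bit-fields add more rules on top, but they are rules all the same, which is why the layout stays computable.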

Systems engineers analyze complex classes through structural introspection. Let's look at a complex class:

class ComplexSecret {
private:
    char a = 'Z';
    double b = 3.14;
    int target = 42;
    float d = 1.5f;

public:
    void printTarget() {
        cout << "target = " << target << endl;
    }
};

You cannot easily "guess" the offset of target here because of the mixed data types, varying alignment requirements, and compiler-injected padding. But you can navigate it logically:

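Method 1: Using offsetof

The offsetof macro (from <cstddef>) returns the exact byte distance of a member from the start of a standard-layout type. One catch: offsetof respects access control, so offsetof(ComplexSecret, target) does not compile outside the class. The usual workaround is a public mirror struct with the same members in the same order, which in practice gets the same offsets.

#include <cstddef> // For offsetof

// offsetof obeys 'private', so mirror the layout with public members.
struct Mirror {
    char a;
    double b;
    int target;
    float d;
};

int main() {
    ComplexSecret obj;
    obj.printTarget(); // Output: target = 42

    // 1. Calculate precise offset from the mirror
    std::size_t offset = offsetof(Mirror, target);

    // 2. Resolve pointer to exact memory location
    char* base_ptr = (char*)&obj;
    int* target_ptr = (int*)(base_ptr + offset);

    // 3. Overwrite
    *target_ptr = 999;

    obj.printTarget(); // Output: target = 999
}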

Method 2: Byte-level Reverse Engineering

If you don't even have the class definition to use offsetof, you can cast the object to an unsigned char* and iterate through it byte-by-byte to find known patterns, effectively reverse-engineering the live memory layout:

int main() {
    ComplexSecret obj;
    unsigned char* p = (unsigned char*)&obj;

    cout << "Live Memory Dump:\n";
    for(size_t i = 0; i < sizeof(obj); i++) {
        // Outputting byte at offset 'i'
        cout << "Offset " << i << " : " << (int)p[i] << "\n";
    }
    // You can identify exactly where '42' is stored, find the offset,
    // and manually inject a new value at that exact index.
}
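The "find the pattern, then inject" step described in the comments could look roughly like this. The helpers find_int and patch_int and the getTarget() accessor are hypothetical names for illustration. The sketch assumes the value being searched for appears only once in the object, and it expects the object to be created with {} so that padding bytes are zeroed rather than left indeterminate.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class ComplexSecret {
    char a = 'Z';
    double b = 3.14;
    int target = 42;
    float d = 1.5f;
public:
    int getTarget() const { return target; }   // accessor added only for checking
};

// Scan int-aligned slots for the bytes of `needle`.
// Returns the matching byte offset, or sizeof(T) if nothing matches.
template <typename T>
std::size_t find_int(const T& obj, int needle) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&obj);
    for (std::size_t i = 0; i + sizeof(int) <= sizeof(T); i += alignof(int)) {
        if (std::memcmp(p + i, &needle, sizeof needle) == 0) {
            return i;
        }
    }
    return sizeof(T);
}

// Write `value` into the object at a previously discovered byte offset.
template <typename T>
void patch_int(T& obj, std::size_t offset, int value) {
    std::memcpy(reinterpret_cast<unsigned char*>(&obj) + offset, &value, sizeof value);
}
```

With ComplexSecret obj{}; a call to find_int(obj, 42) yields the offset of target, and patch_int(obj, offset, 999) rewrites it without ever naming the member.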

Method 3: Debugger Inspection

Debugging tools like GDB allow a developer to pause execution and physically map the structured layout of an instantiated class, completely ignoring C++ access controls without needing a single line of C++ code.

$ gdb ./a.out
(gdb) p obj
$1 = {a = 90 'Z', b = 3.1400000000000001, target = 42, d = 1.5}

Why Break Encapsulation in the Real World?

At this point, you might be wondering: "Why would anyone ever do this intentionally?" In standard application development, you shouldn't. However, this level of memory manipulation is absolutely required for high-performance systems programming:

Serialization Frameworks: Libraries like Protocol Buffers or JSON/reflection serializers often need to read all members of an object to serialize them, bypassing encapsulation entirely without manually writing hundreds of getter methods.
Custom Memory Allocators: High-performance game engines (like Unreal Engine) manage their own clustered memory pages and use offset-based raw pointers to rapidly initialize, pool, or relocate objects in memory.
Debugging and Telemetry: Crash dump analyzers often dynamically inspect live object layouts to capture private internal state for logging, without needing to modify the crash-prone core logic.

The Only Way to Hide: The PIMPL Idiom

If private variables are just a suggestion and anyone can calculate memory offsets, how does a C++ systems programmer genuinely protect their data?

You have to hide the memory layout itself from the compiler when the client code is compiled. The most robust way to achieve this is the PIMPL (Pointer to Implementation) idiom.

Instead of declaring your private variables in the header file where everyone can see (and map) them, you forward-declare an internal structure and only store a pointer to it.

Secret.h (What the client sees)

#include <memory>

class Secret {
public:
    Secret();
    ~Secret();
    void printTarget();

private:
    struct Impl; // Forward declaration
    std::unique_ptr<Impl> pImpl; // The opaque pointer
};

Secret.cpp (The actual hidden memory)

#include "Secret.h"
#include <iostream>

// The memory layout is only known inside the .cpp file
struct Secret::Impl {
    int target = 42;
};

Secret::Secret() : pImpl(std::make_unique<Impl>()) {}
Secret::~Secret() = default;

void Secret::printTarget() {
    std::cout << "target = " << pImpl->target << std::endl;
}

Now, if someone evaluates sizeof(Secret), they only get the size of the pointer. If they try to guess offsets from the header or use offsetof() to find internal variables, they have nothing to work with: the client simply does not have the definition of Secret::Impl, so the layout of the actual data is invisible at compile time. The bytes still live on the heap, though, so a debugger or anyone willing to follow the pointer at runtime can still reach them. PIMPL takes away the map, not the territory.
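One caveat worth sketching: PIMPL hides the layout from the compiler, not from the running process. The snippet below squeezes the header and the source file into one block for brevity, then plays a client who knows nothing about Impl. It assumes that std::unique_ptr with the default deleter stores exactly one raw pointer (true for the mainstream standard libraries, but not promised by the standard) and guesses that the hidden struct starts with an int. The helper peek_behind_pimpl is a hypothetical name.

```cpp
#include <cassert>
#include <cstring>
#include <memory>

// --- stand-in for Secret.h ---
class Secret {
public:
    Secret();
    ~Secret();
private:
    struct Impl;
    std::unique_ptr<Impl> pImpl;
};

// --- stand-in for Secret.cpp ---
struct Secret::Impl {
    int target = 42;
};
Secret::Secret() : pImpl(std::make_unique<Impl>()) {}
Secret::~Secret() = default;

// --- client side: no knowledge of Impl ---
// Assumption: the object is nothing but one raw pointer.
static_assert(sizeof(Secret) == sizeof(void*), "unique_ptr is assumed to be a bare pointer");

int peek_behind_pimpl(const Secret& s) {
    const void* impl = nullptr;
    std::memcpy(&impl, &s, sizeof impl);       // copy the opaque pointer out
    int guess = 0;
    std::memcpy(&guess, impl, sizeof guess);   // guess: the hidden data starts with an int
    return guess;
}
```

The client had to guess the hidden layout instead of reading it from a header, which is exactly the extra friction PIMPL buys.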

The Hidden Costs of PIMPL

True encapsulation in C++ comes at a hefty price. By hiding the layout behind PIMPL, you incur three major architectural costs:

Heap Allocation: You are now forced to allocate memory dynamically (e.g., using std::make_unique), which is typically far more expensive than allocating it on the stack.
Performance Overhead: Every single access to a private variable now requires pointer indirection. The data no longer sits next to the object, so this can cause CPU cache misses, which may noticeably degrade performance in critical loops.
Complex Semantics: A default unique_ptr disables copying. If you want to copy a Secret object, you must now write custom deep-copy constructors and assignment operators.
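The deep-copy boilerplate from the last point might look like the sketch below, again with header and source merged into one block. The target() and setTarget() members are hypothetical additions so the copy can be observed. Move operations are deliberately left out, so pImpl is never null and the copy operations do not need a null check.

```cpp
#include <cassert>
#include <memory>

// --- stand-in for Secret.h ---
class Secret {
public:
    Secret();
    ~Secret();
    Secret(const Secret& other);              // deep copy
    Secret& operator=(const Secret& other);   // deep copy assignment
    int target() const;
    void setTarget(int value);
private:
    struct Impl;
    std::unique_ptr<Impl> pImpl;
};

// --- stand-in for Secret.cpp ---
struct Secret::Impl {
    int target = 42;
};

Secret::Secret() : pImpl(std::make_unique<Impl>()) {}
Secret::~Secret() = default;

// Allocate a fresh Impl and copy the hidden state into it.
Secret::Secret(const Secret& other) : pImpl(std::make_unique<Impl>(*other.pImpl)) {}

// Reuse the existing allocation and copy the hidden state across.
Secret& Secret::operator=(const Secret& other) {
    if (this != &other) {
        *pImpl = *other.pImpl;
    }
    return *this;
}

int Secret::target() const { return pImpl->target; }
void Secret::setTarget(int value) { pImpl->target = value; }
```

Each copy pays for its own heap allocation, which is the first cost on the list showing up again.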

Summary: Privacy vs. Reality

Approach	Safety Level	Underlying Reality
`private` Keyword	Compile-Time (High)	Protects against accidental misuse in code, easily bypassed in memory.
Raw Pointer Casting	None (High UB Risk)	Violates Strict Aliasing; relies on padding assumptions. Only for discovery, not production.
Safe Type-Punning	Memory-Safe	Using `std::memcpy` or `std::bit_cast` prevents UB while safely extracting internal memory.
PIMPL Idiom	True Opaque Safety	Hides layout completely at compile time, but pays a real penalty in performance and allocations.

The Real Lesson

Most modern languages deliberately build high walls and guardrails to protect you from your own mistakes. C++ intentionally hands you a sledgehammer and assumes you know what you're doing.

C++ encapsulation is a discipline, not a guarantee. It gives you a heavy lock to put on your front door, but cheerfully leaves the wall next to it open for anyone willing to punch through the drywall. The language operates on one core assumption: "I trust you to respect your own abstractions."

Private variables exist strictly to communicate intent to other developers. Respect that intent, and you build clean, beautifully designed software. Break it, and you unlock absolute, terrifying low-level control.

C++ simply gives you the freedom to choose your poison.

Screenshot of my favorite professor, Sanjeet Sir, answering this at 12:54 AM midnight