Performance Benchmarks
Nebula is engineered for high-performance execution: in the benchmarks summarized below, it runs 1.5x to 4x faster than Python 3.11.
Executive Summary
| Benchmark | Nebula | Python 3.11 | Speedup |
|---|---|---|---|
| Fibonacci(28) | 0.05s | 0.20s | 4x |
| Sum Loop (1M) | 0.15s | 0.45s | 3x |
| String Concat | 0.08s | 0.12s | 1.5x |
| Matrix Math | 0.22s | 0.65s | 3x |
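The Speedup column appears to be the Python time divided by the Nebula time, rounded. A minimal Python sketch of that arithmetic, using the timings from the table above (the `timings` and `speedup` names are illustrative and not part of Nebula):

```python
# Timings in seconds, copied from the summary table: (nebula, python).
timings = {
    "Fibonacci(28)": (0.05, 0.20),
    "Sum Loop (1M)": (0.15, 0.45),
    "String Concat": (0.08, 0.12),
    "Matrix Math": (0.22, 0.65),
}


def speedup(nebula_s, python_s):
    """Return how many times faster Nebula is, rounded to one decimal."""
    return round(python_s / nebula_s, 1)


speedups = {name: speedup(n, p) for name, (n, p) in timings.items()}
for name, factor in speedups.items():
    print(f"{name}: {factor}x")
```

Matrix Math works out to roughly 2.95x before rounding, which the table reports as 3x.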
Methodology
All benchmarks were run on:
- CPU: AMD Ryzen 7 5800X
- RAM: 32GB DDR4
- OS: Windows 11 / Linux (Ubuntu 22.04)
- Nebula: v1.0.0 (--vm mode)
- Python: 3.11.4
Each test was run 10 times; the median time is reported.
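The "10 runs, median reported" procedure can be sketched on the Python side as follows. This is an assumption about how such a harness might look, not the script actually used for these numbers; `median_runtime` is a hypothetical helper name.

```python
import statistics
import time


def median_runtime(func, runs=10):
    """Call func `runs` times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


if __name__ == "__main__":
    elapsed = median_runtime(lambda: sum(range(1_000_000)))
    print(f"median: {elapsed:.4f}s")
```

The median is less sensitive than the mean to a single slow run caused by background activity on the machine.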
Detailed Benchmarks
Fibonacci (Recursive)
Classic recursive implementation:
fn fib(n) do
    if n <= 1 do
        give n
    end
    give fib(n - 1) + fib(n - 2)
end
log(fib(28))
| n | Nebula | Python | Speedup |
|---|---|---|---|
| 25 | 0.01s | 0.05s | 5x |
| 28 | 0.05s | 0.20s | 4x |
| 30 | 0.12s | 0.52s | 4.3x |
| 35 | 1.20s | 5.40s | 4.5x |
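The Python program used for this comparison is not shown here. A direct translation of the Nebula version, assuming the same naive double recursion with no memoization, would look like this:

```python
def fib(n):
    """Naive doubly recursive Fibonacci, mirroring the Nebula version."""
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)


if __name__ == "__main__":
    print(fib(28))  # 317811
```

Adding a cache such as `functools.lru_cache` would change what is being measured, since the benchmark is meant to stress function-call overhead.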
Loop Performance
Simple counter loop:
fn loop_test(n) do
    sum = 0
    i = 0
    while i < n do
        sum = sum + i
        i = i + 1
    end
    give sum
end
log(loop_test(1000000))
| Iterations | Nebula | Python | Speedup |
|---|---|---|---|
| 100K | 0.03s | 0.10s | 3.3x |
| 1M | 0.15s | 0.45s | 3x |
| 10M | 1.50s | 4.80s | 3.2x |
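For a like-for-like comparison, the Python side should use the same explicit `while` loop. The sketch below is an assumed equivalent, not the original benchmark script:

```python
def loop_test(n):
    """Sum the integers 0..n-1 with an explicit counter, as in the Nebula code."""
    total = 0
    i = 0
    while i < n:
        total = total + i
        i = i + 1
    return total


if __name__ == "__main__":
    print(loop_test(1_000_000))  # 499999500000
```

Using `sum(range(n))` instead would move the loop into C code inside the interpreter and would no longer measure bytecode dispatch.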
String Operations
String concatenation and manipulation:
fn string_test() do
    result = ""
    for i = 0, 10000 do
        result = result + "x"
    end
    give len(result)
end
| Operations | Nebula | Python | Note |
|---|---|---|---|
| 10K concat | 0.08s | 0.12s | String interning helps |
Constant Folding
Static math expressions:
# These are computed at compile time
perm result = 2 + 3 * 4 - 1
perm circle = 3.14159 * 10 * 10
| Expression | Nebula | Python |
|---|---|---|
| Static math | 0.00s | 0.01s |
The Nebula compiler evaluates constant expressions at compile time, so they add no cost at runtime.
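Nebula's compiler internals are not shown here, so the following is only an illustration of the general technique. It is a minimal constant folder written against Python's own `ast` module; `ConstantFolder` and `fold` are hypothetical names. It collapses any arithmetic node whose operands are both numeric literals.

```python
import ast
import operator

# Operators this sketch knows how to fold.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class ConstantFolder(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first (bottom-up)
        op = _OPS.get(type(node.op))
        left, right = node.left, node.right
        if (
            op is not None
            and isinstance(left, ast.Constant)
            and isinstance(right, ast.Constant)
            and isinstance(left.value, (int, float))
            and isinstance(right.value, (int, float))
        ):
            folded = ast.Constant(value=op(left.value, right.value))
            return ast.copy_location(folded, node)
        return node


def fold(source):
    """Parse an expression and return its AST with constants folded."""
    tree = ast.parse(source, mode="eval")
    return ConstantFolder().visit(tree).body


if __name__ == "__main__":
    print(ast.dump(fold("2 + 3 * 4 - 1")))  # a single Constant node
    print(ast.dump(fold("x + 2 * 3")))      # x + 6; the variable stays
```

Only the subexpressions made entirely of literals are replaced; anything that depends on a variable is left for runtime.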
Why Nebula is Fast
1. NaN Boxing
All values fit in 64 bits. No heap allocation for primitives.
Number: [64-bit IEEE 754 float]
Integer: [NaN-tagged 48-bit integer]
Boolean: [NaN-tagged single bit]
Pointer: [NaN-tagged 48-bit address]
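The layout above can be modeled in Python by treating each value as a 64-bit integer. The tag numbers and the position of the tag bits below are assumptions chosen for illustration; Nebula's actual bit assignments are not documented here.

```python
import struct

QNAN = 0x7FF8_0000_0000_0000          # exponent all ones plus the quiet-NaN bit
PAYLOAD_MASK = (1 << 48) - 1          # low 48 bits carry the payload
TAG_SHIFT = 48                        # three tag bits sit just above the payload
TAG_INT, TAG_BOOL, TAG_PTR = 1, 2, 3  # hypothetical tag values


def box_number(x):
    """Reinterpret a float's IEEE 754 bits as a 64-bit unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def box_tagged(tag, payload):
    """Hide a tag and a 48-bit payload inside a quiet NaN."""
    return QNAN | (tag << TAG_SHIFT) | (payload & PAYLOAD_MASK)


def unbox(bits):
    """Decode a boxed 64-bit value back into a Python object."""
    tag = (bits >> TAG_SHIFT) & 0x7
    if (bits & QNAN) != QNAN or tag == 0:
        # An ordinary double; tag 0 is a genuine NaN produced by arithmetic.
        return struct.unpack("<d", struct.pack("<Q", bits))[0]
    payload = bits & PAYLOAD_MASK
    if tag == TAG_INT:
        # Sign-extend the 48-bit two's-complement payload.
        return payload - (1 << 48) if payload & (1 << 47) else payload
    if tag == TAG_BOOL:
        return bool(payload)
    return payload  # TAG_PTR: a raw 48-bit address


if __name__ == "__main__":
    print(unbox(box_number(1.5)))
    print(unbox(box_tagged(TAG_INT, -5)))
    print(unbox(box_tagged(TAG_BOOL, 1)))
```

Because every non-NaN double leaves at least one of the `QNAN` bits clear, a single mask-and-compare distinguishes numbers from tagged values.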
2. Global Indexing
Variables are array indices, not hash lookups:
# Source
x = 10
# VM sees
STORE_GLOBAL_0 # Direct array access
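One way to picture this is that the name-to-slot mapping exists only while compiling, and the running VM sees nothing but integer slots. The sketch below assumes that design; `GlobalTable`, `compile_store`, and `execute_store` are hypothetical names, not Nebula APIs.

```python
class GlobalTable:
    def __init__(self):
        self._index = {}  # name -> slot number, consulted only while compiling
        self.slots = []   # runtime storage, addressed by integer

    def resolve(self, name):
        """Return the slot for a name, allocating one on first use."""
        if name not in self._index:
            self._index[name] = len(self.slots)
            self.slots.append(None)
        return self._index[name]


def compile_store(table, name):
    """Compile `name = <value>` into an instruction carrying a slot index."""
    return ("STORE_GLOBAL", table.resolve(name))


def execute_store(table, instr, value):
    """Run a STORE_GLOBAL: plain list indexing, no hashing of the name."""
    _, slot = instr
    table.slots[slot] = value


if __name__ == "__main__":
    table = GlobalTable()
    instr = compile_store(table, "x")
    execute_store(table, instr, 10)
    print(instr, table.slots)
```

The hash lookup still happens, but once per name at compile time rather than on every access.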
3. String Interning
Equality comparison between interned strings is O(1):
perm a = "hello"
perm b = "hello"
log(a == b) # Pointer comparison, instant
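The idea can be shown with a small intern pool in Python. This is a sketch of the general technique under the assumption that Nebula keeps one canonical object per distinct string; `Interner` is a hypothetical name.

```python
class Interner:
    def __init__(self):
        self._pool = {}  # string contents -> the single canonical object

    def intern(self, text):
        """Return the canonical object for this text, storing it if new."""
        return self._pool.setdefault(text, text)


if __name__ == "__main__":
    pool = Interner()
    a = pool.intern("hello")
    b = pool.intern("".join(["hel", "lo"]))  # built at runtime, same contents
    print(a is b)  # identity check: compares references, not characters
```

The cost of hashing the string is paid once when it is interned; every later equality test is a reference comparison regardless of length.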
4. Peephole Optimization
Redundant bytecode is eliminated:
LOAD_CONST 1    →    LOAD_CONST 3
LOAD_CONST 2
ADD
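A peephole pass of this kind can be sketched as a single scan over the instruction list. The tuple encoding below, and the assumption that the `LOAD_CONST` operand is the literal value rather than a constant-pool index, are illustrative choices and not Nebula's real bytecode format.

```python
def peephole(code):
    """Fold LOAD_CONST a, LOAD_CONST b, ADD into one LOAD_CONST a+b."""
    out = []
    for instr in code:
        out.append(instr)
        if (
            len(out) >= 3
            and out[-1] == ("ADD",)
            and out[-2][0] == "LOAD_CONST"
            and out[-3][0] == "LOAD_CONST"
        ):
            total = out[-3][1] + out[-2][1]
            del out[-3:]  # drop the three-instruction window
            out.append(("LOAD_CONST", total))
    return out


if __name__ == "__main__":
    program = [("LOAD_CONST", 1), ("LOAD_CONST", 2), ("ADD",)]
    print(peephole(program))
```

Because the pattern is checked against the output built so far, a chain such as `1 + 2 + 3` collapses completely in one pass.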
5. Specialized Instructions
Common patterns have dedicated opcodes:
LOAD_LOCAL 0 → LOAD_LOCAL_0 (single byte)
INC_LOCAL 0 (increment without load/store)
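The two rewrites above might be applied by a pass like the one below. It assumes that `INC_LOCAL n` replaces the four-instruction sequence load, push 1, add, store on the same local; `STORE_LOCAL` and `RETURN` are hypothetical opcode names used only for this sketch.

```python
def specialize(code):
    """Replace common instruction patterns with dedicated opcodes."""
    out = []
    i = 0
    while i < len(code):
        window = code[i:i + 4]
        if (
            len(window) == 4
            and window[0][0] == "LOAD_LOCAL"
            and window[1] == ("LOAD_CONST", 1)
            and window[2] == ("ADD",)
            and window[3] == ("STORE_LOCAL", window[0][1])
        ):
            # local = local + 1 becomes a single increment instruction.
            out.append(("INC_LOCAL", window[0][1]))
            i += 4
        elif code[i] == ("LOAD_LOCAL", 0):
            # Slot 0 gets an operand-free opcode.
            out.append(("LOAD_LOCAL_0",))
            i += 1
        else:
            out.append(code[i])
            i += 1
    return out


if __name__ == "__main__":
    increment = [("LOAD_LOCAL", 0), ("LOAD_CONST", 1), ("ADD",), ("STORE_LOCAL", 0)]
    print(specialize(increment))
    print(specialize([("LOAD_LOCAL", 0), ("RETURN",)]))
```

The increment pattern is tested first so that a load of slot 0 that begins an increment is not rewritten on its own.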
Running Your Own Benchmarks
fn benchmark(name, iterations, func) do
    start = clock()
    i = 0
    while i < iterations do
        func()
        i = i + 1
    end
    elapsed = clock() - start
    log(name, ":", elapsed, "ms")
end
benchmark("fib", 1000, fn() = fib(20))
Comparison with Other Languages
| Language | Fib(30) time | Time relative to C |
|---|---|---|
| C (gcc -O3) | 0.01s | 1x |
| Rust | 0.01s | 1x |
| Nebula | 0.12s | 12x |
| Lua | 0.18s | 18x |
| Python | 0.52s | 52x |
| Ruby | 0.65s | 65x |
On the Fib(30) benchmark, Nebula is about 4.3x faster than Python (0.12s vs 0.52s) and about 12x slower than C or Rust, while maintaining ease of use similar to Python's.