tinygpu

TinyGPU 🐉⚡ - v2.0.0

Release v2.0.0 | Python 3.13 | License: MIT | CI | Tests | Code Style: Black

TinyGPU is a tiny, educational, SIMT-style GPU simulator.

🎓 Built for learning and visualization: see how threads, registers, and memory interact across cycles!


🚀 What’s New in v2.0.0


Quick Screenshots / Demos

Odd–Even Transposition Sort

Odd-Even Sort

Parallel Reduction (Sum)

Reduce Sum


Getting Started

Clone and install (editable):

git clone https://github.com/deaneeth/tinygpu.git
cd tinygpu
pip install -e .
pip install -r requirements-dev.txt

Run a demo (odd-even sort):

python -m examples.run_odd_even_sort

Produces: outputs/run_odd_even_sort/run_odd_even_sort_*.gif, a visualization of the GPU-style sorting process.
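For readers new to the algorithm behind this demo, here is a minimal plain-Python sketch of odd-even transposition sort. It does not use TinyGPU or its assembly; the function name `odd_even_transposition_sort` is made up for illustration, and an ordinary loop stands in for the threads that would each handle one pair on a GPU.

```python
def odd_even_transposition_sort(values):
    """Sort a list using alternating even and odd compare-swap phases."""
    data = list(values)
    n = len(data)
    for phase in range(n):
        # Even phases compare pairs (0,1), (2,3), ...
        # Odd phases compare pairs (1,2), (3,4), ...
        start = phase % 2
        # Pairs within one phase never overlap, so on a GPU each pair could
        # be handled by its own thread. Here a sequential loop stands in.
        for i in range(start, n - 1, 2):
            # Same rule the CSWAP instruction describes: swap if left > right.
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data


print(odd_even_transposition_sort([5, 1, 4, 2, 3]))
```

Running n phases is enough to sort n elements, which is why the outer loop is bounded by the list length.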


Examples & Runners


Instruction Set (Quick Reference)

| Instruction | Description | Operation |
|---|---|---|
| SET Rd, imm | Set register Rd to an immediate constant. | Rd = imm |
| ADD Rd, Ra, Rb | Add two registers and store the result in Rd. | Rd = Ra + Rb |
| ADD Rd, Ra, imm | Add a register and an immediate value. | Rd = Ra + imm |
| MUL Rd, Ra, Rb | Multiply two registers. | Rd = Ra * Rb |
| MUL Rd, Ra, imm | Multiply a register by an immediate. | Rd = Ra * imm |
| LD Rd, addr | Load from a memory address into a register. | Rd = mem[addr] |
| LD Rd, Rk | Load from the address held in register Rk. | Rd = mem[Rk] |
| ST addr, Rs | Store a register into a memory address. | mem[addr] = Rs |
| ST Rk, Rs | Store Rs into memory at the address held in register Rk. | mem[Rk] = Rs |
| SHLD Rd, saddr | Load from shared memory into a register. | Rd = shared_mem[saddr] |
| SHST saddr, Rs | Store a register into shared memory. | shared_mem[saddr] = Rs |
| CSWAP addrA, addrB | Compare-and-swap memory values. Used for sorting. | If mem[addrA] > mem[addrB], swap them. |
| CMP Ra, Rb | Compare and set flags. | Set Z/N/G flags based on Ra - Rb. |
| BRGT target | Branch if greater. | Jump to target if G flag set. |
| BRLT target | Branch if less. | Jump to target if N flag set. |
| BRZ target | Branch if zero. | Jump to target if Z flag set. |
| JMP target | Unconditional jump. Target is a label or immediate. | PC = target |
| SYNC | Global synchronization barrier: all threads must reach this point. | (no operands) |
| SYNCB | Block-level synchronization barrier. | (no operands) |
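To make the semantics in the table concrete, the following is a rough single-thread sketch of how a subset of these instructions (SET, ADD, MUL, LD, ST, CMP, BRZ/BRLT/BRGT, JMP) could be interpreted. It is not TinyGPU's implementation or API. The tuple-based program format, the `run` function, numeric branch targets instead of labels, and the reading of the flags (Z when Ra - Rb is zero, N when negative, G when positive) are all assumptions made for this illustration.

```python
def run(program, memory, num_regs=8, max_steps=10_000):
    """Interpret a list of instruction tuples for one hypothetical thread."""
    regs = [0] * num_regs
    flags = {"Z": False, "N": False, "G": False}

    def reg(name):
        # Register names are strings such as "R3" (assumed format).
        return int(name[1:])

    def val(operand):
        # A string operand names a register; an int is an immediate or address.
        return regs[reg(operand)] if isinstance(operand, str) else operand

    pc = 0
    for _ in range(max_steps):  # guard against accidental infinite loops
        if pc >= len(program):
            break
        op, *args = program[pc]
        pc += 1
        if op == "SET":
            regs[reg(args[0])] = args[1]
        elif op == "ADD":
            regs[reg(args[0])] = val(args[1]) + val(args[2])
        elif op == "MUL":
            regs[reg(args[0])] = val(args[1]) * val(args[2])
        elif op == "LD":
            regs[reg(args[0])] = memory[val(args[1])]
        elif op == "ST":
            memory[val(args[0])] = val(args[1])
        elif op == "CMP":
            diff = val(args[0]) - val(args[1])
            flags = {"Z": diff == 0, "N": diff < 0, "G": diff > 0}
        elif op in ("BRZ", "BRLT", "BRGT"):
            if flags[{"BRZ": "Z", "BRLT": "N", "BRGT": "G"}[op]]:
                pc = args[0]
        elif op == "JMP":
            pc = args[0]
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return regs, memory


# Sum mem[0..3] into mem[4]. Branch targets are program indices.
program = [
    ("SET", "R0", 0),           # 0: R0 = running sum
    ("SET", "R1", 0),           # 1: R1 = index
    ("SET", "R2", 4),           # 2: R2 = element count
    ("CMP", "R1", "R2"),        # 3: loop head
    ("BRZ", 9),                 # 4: index == count, leave the loop
    ("LD", "R3", "R1"),         # 5: R3 = mem[R1]
    ("ADD", "R0", "R0", "R3"),  # 6: sum += R3
    ("ADD", "R1", "R1", 1),     # 7: index += 1
    ("JMP", 3),                 # 8: back to loop head
    ("ST", 4, "R0"),            # 9: mem[4] = sum
]
regs, memory = run(program, [3, 1, 4, 1, 0])
print(memory[4])  # 9
```

Shared memory, CSWAP, and the SYNC/SYNCB barriers are left out because they only become meaningful once several threads run in lockstep, which this single-thread sketch does not model.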

Publishing & Contributing


License

MIT. See LICENSE.