Event Normalization Layer
This page explains how gem converts raw event payloads into typed game events and normalized combat log entries.
Modules covered:
src/gem/game_events.pysrc/gem/combatlog.py
Prerequisites:
- Bits & Bytes Primer
- Stream Layer (
stream.py) - Parser Layer (
parser.py) - SendTable Layer (
sendtable.py) - State Reconstruction Layer (
string_table.py+entities.py)
Why this is the next layer
After entities/state are reconstructed, parser still needs to handle semantic event channels:
- Source1 game events (
CMsgSource1LegacyGameEventList+CMsgSource1LegacyGameEvent) - Source2 combat log user messages (
CDOTAUserMsg_CombatLogBulkData) - Direct combat/chat inner messages in some streams
This layer turns those channels into stable Python objects/callbacks.
game_events.py: schema-driven typed access
Core constants
| Constant | Value | Meaning |
|---|---|---|
_TYPE_STRING | 1 | Event key is string |
_TYPE_FLOAT | 2 | Event key is float |
_TYPE_LONG | 3 | Event key is int32-long |
_TYPE_SHORT | 4 | Event key is int16 |
_TYPE_BYTE | 5 | Event key is int8 |
_TYPE_BOOL | 6 | Event key is bool |
_TYPE_UINT64 | 7 | Event key is uint64 |
These are used by typed getters (get_string, get_float, get_int32, etc.) to validate field type before returning values.
Main types
| Type | Role |
|---|---|
GameEventSchema | Event schema metadata (event_id, name, fields map) |
GameEvent | One typed event instance with accessor methods |
GameEventManager | Schema registry + handler dispatch hub |
Event flow
- Parser receives event schema list message.
GameEventManager.register_schema(...)stores event id -> schema mapping.- Parser receives raw event instance.
GameEventManager.dispatch(...)looks up schema and buildsGameEvent.- Registered handlers for that event name are called.
Why schema registration matters:
- Raw events store keys as indexed values.
- Schema converts those positional keys into named, typed fields.
combatlog.py: unify S1 + S2 combat channels
combatlog.py normalizes multiple input paths to one output model: CombatLogEntry.
Key constants
| Constant | Meaning |
|---|---|
COMBAT_LOG_TYPES | Supported normalized log labels |
_LOG_TYPE_NAMES | Dota combat type int -> normalized label |
_DAMAGE_TYPE_NAMES | Damage type int -> physical/magical/pure/others |
_S1_FIELD_* | Field names for legacy Source1 dota_combatlog event keys |
CombatLogEntry
Single normalized record with fields such as:
tick,log_type- attacker/target/inflictor names
value,value_name,damage_type- hero/illusion flags
ability_level,gold_reason,xp_reasonstun_duration
Name resolution helper
_resolve_name(name_table, index) maps integer indices via CombatLogNames string table.
This is critical because many combat payloads carry only numeric name references.
Ingestion paths handled
Path A: Source1 legacy game event
process_s1_event(...):
- reads integer/string/bool fields from typed
GameEvent - resolves attacker/target/inflictor indices
- maps log type int -> normalized label
- emits
CombatLogEntry
Path B: Source2 bulk user message
process_s2_bulk(...):
- iterates each
CMsgDOTACombatLogEntry - delegates each entry to
process_s2_entry(...)
Path C: Source2 direct single entry
process_s2_entry(...) handles direct HLTV-style channel too.
Important normalization details:
- purchase events resolve
value_namefrommsg.valueindex. - combat
msg.valueis decoded as signed int32 from uint32 wire value (two's complement reinterpretation). - damage type gets mapped to normalized labels.
- stun duration is included if field exists.
Parser integration points
In parser.py, this layer is invoked from:
_on_game_event(...)for S1dota_combatlog_on_user_message(...)for S2 bulk combat log_dispatch_inner(...)directCMsgDOTACombatLogEntryand chat events
This is where game-end detection also hooks in (GAME_STATE == 6 path).
Real snapshot from fixture
Using tests/fixtures/8520014563.dem and parsing up to tick 12000:
tick_end 12002
game_events_seen 90
unique_game_events 2
GE dota_chase_hero 89
GE hltv_versioninfo 1
combat_entries 2147
CL DAMAGE 1047
CL MODIFIER_ADD 211
CL MODIFIER_REMOVE 196
CL XP 183
CL DEATH 139
CL GOLD 107
CL ITEM 82
CL HEAL 65
CL PURCHASE 58
CL ABILITY 54
CL PICKUP_RUNE 5
combat_samples
(2235, 'DAMAGE', '', '', '', 3, '')
(3135, 'DAMAGE', '', '', '', 8, '')
(3618, 'MODIFIER_ADD', 'npc_dota_hero_muerta', 'npc_dota_hero_muerta', 'modifier_muerta_gunslinger', 0, '')
(3620, 'GOLD', '', 'npc_dota_hero_queenofpain', '', 600, '')
...Interpretation:
- Game-event channel can be sparse/limited depending on replay path.
- Combat log channel carries the majority of actionable timeline events.
- Normalized output has consistent shape regardless of S1/S2 input route.
Common failure modes in this layer
- Missing
CombatLogNamestable -> names resolve as empty strings. - Wrong signed-value handling -> negative gold/damage semantics break.
- S1/S2 path drift -> inconsistent outputs for same replay behavior.
- Wrong type mapping -> event getters return errors/defaults.
When symptoms are “stats/extractors look wrong but entity state seems fine,” inspect this layer next.