Dothack Network

A starter guide to the main executable file of .hack//infection


Riley

Introduction

Thankfully, .hack//infection ships with its debug symbols intact. This gives insight into the development process and into any debug features that may be present. "Debug symbols" are the names and accompanying information for pieces of code, which can be methods, classes, members, enumerations, you name it! They also help in understanding the structure of the other files present in the game.

CodeWarrior

CyberConnect2 used Metrowerks CodeWarrior 2.4.1.01 with PS2 Support, which includes C/C++ and R5900 compilation support. It is difficult to purchase, but it is not required if you have some understanding of the file format it creates and the sections of the ELF file. The ELF files compiled for the PlayStation 2 use little-endian byte ordering. This is the same byte order used by x86 applications built on Windows, although Windows executables use the PE format rather than ELF.
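
The byte order can be confirmed from the first 16 bytes of the file (the ELF identification block described in the elf(5) man page). Below is a minimal sketch in C. The function names are made up for this guide, and it assumes the game's executable is a 32-bit ELF.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if buf starts with an ELF identification block that declares
   a 32-bit, little-endian file, and 0 otherwise. */
int is_elf32_little_endian(const uint8_t *buf, size_t len)
{
    if (len < 16) return 0;              /* e_ident is 16 bytes long */
    if (buf[0] != 0x7F || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return 0;                        /* missing ELF magic */
    if (buf[4] != 1) return 0;           /* EI_CLASS: 1 = ELFCLASS32 */
    if (buf[5] != 1) return 0;           /* EI_DATA:  1 = ELFDATA2LSB */
    return 1;
}

/* Reads a 32-bit little-endian value, least significant byte first. */
uint32_t read_u32_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

Reading values byte by byte like this gives the same result no matter what machine the tool itself runs on.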

Metrowerks CodeWarrior can compile not just to a single ELF file, but also to PRG files, which the ELF file links to and reads. PRG files can be identified by the magic header MWo3, followed by a single-byte number that gives the order in which the files are loaded.
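
As a rough sketch, checking a file for the PRG header could look like the following. The function name is hypothetical, and nothing past the fifth byte is interpreted because the rest of the PRG layout is not covered here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns the load-order byte (0..255) if buf begins with the "MWo3" magic,
   or -1 if the magic is missing or the buffer is too short. */
int prg_load_order(const uint8_t *buf, size_t len)
{
    if (len < 5) return -1;                      /* 4 magic bytes + 1 order byte */
    if (memcmp(buf, "MWo3", 4) != 0) return -1;  /* not a PRG file */
    return buf[4];
}
```

Sorting the PRG files by this value should give the order described above.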

CodeWarrior CATS Info (.mwcats)

.mwcats is an ELF section that contains information on function calls and their exits. Little is currently known about what this can be used for; however, a successful extraction of this section using CATS yields output similar to the following example:

            *** CATS INFO (.mwcats) ***

00000000    Section type       2
00000001        nstd-exit      0
00000002        size           60
00000004        address        001000d0

Section Structure:
- byte - SectionType
- byte - NSTDExits
- short - Size (the example above places Size at offset 2 and Address at offset 4, so Size is two bytes wide)
- int - Address
- List<short> - Offsets

For every nstd-exit, there is one offset (short) that needs to be read. If the number of nstd-exits is odd, an additional short must be skipped to stay aligned with the rest of the data.
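
Here is a minimal sketch of reading one record from the section. It assumes the field widths implied by the offsets in the example dump (Size at offset 2, Address at offset 4, so an 8-byte header), little-endian values, and that the padding short is always present after an odd number of offsets. The type and function names are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t  section_type;
    uint8_t  nstd_exits;
    uint16_t size;
    uint32_t address;
    uint16_t offsets[255];   /* nstd_exits is one byte, so at most 255 */
} CatsRecord;

static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parses one record at buf. Returns the number of bytes consumed
   (header + offsets + padding), or 0 if the buffer is too short. */
size_t cats_parse_record(const uint8_t *buf, size_t len, CatsRecord *out)
{
    size_t needed, i;
    if (len < 8) return 0;
    out->section_type = buf[0];
    out->nstd_exits   = buf[1];
    out->size         = rd16(buf + 2);
    out->address      = rd32(buf + 4);
    needed = 8 + 2u * out->nstd_exits;
    if (out->nstd_exits % 2 != 0) needed += 2;   /* skip the alignment short */
    if (len < needed) return 0;
    for (i = 0; i < out->nstd_exits; i++)
        out->offsets[i] = rd16(buf + 8 + 2 * i);
    return needed;
}
```

Calling it in a loop and advancing by the returned byte count walks the whole section.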

ELF Files

ELF files typically use DWARF, a debugging format originally designed for ELF files. .hack//infection contains a DWARF entry in its .debug ELF section. This provides information on most of the game's core functions and data structures, but some information is missing because the program is split between multiple files.

For more information on how to read an ELF file, reference the following:
https://man7.org/linux/man-pages/man5/elf.5.html
https://github.com/DigitalMars/dmc/blob/9478d25a677f70dbe4fc0ed317cc5a5e5050ef8b/include/DWARF.H

For the main code section of .hack//infection (.main), Address is the location in memory where the section can be found, and Address + Size is the end of the section in memory. Offset is the .main section's location in the ELF file, and Offset + Size is the end of the section in the file.
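
Put another way, a memory address inside the section can be turned into a position in the ELF file by subtracting Address and adding Offset. A small sketch, with a hypothetical struct holding just those three section header values:

```c
#include <stdint.h>

typedef struct {
    uint32_t address;  /* where the section sits in memory */
    uint32_t offset;   /* where the section sits in the ELF file */
    uint32_t size;     /* length of the section in bytes */
} SectionRange;

/* Converts a memory address to a file offset. Returns 1 and writes
   *file_off on success, or 0 if addr lies outside the section. */
int address_to_file_offset(const SectionRange *s, uint32_t addr, uint32_t *file_off)
{
    if (addr < s->address || addr - s->address >= s->size) return 0;
    *file_off = s->offset + (addr - s->address);
    return 1;
}
```

This is handy when an address seen in a debugger needs to be located in a hex editor.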

ELF .line section

The .line section contains multiple entries. Each entry begins with a length (int) and an address (int). The address is the location in memory where the entry's code should appear. This is similar to the ELF section header's Address value, and similar to the DWARF tag's AT_low_pc attribute value.

After that header, every 10 bytes hold a line number (int), a character offset (short), and a code offset (int).

Line Section Entry:
- Length - int
- Address - int

Line Section Entry Line:
- Line Number - int
- Character Offset - short - Position left-to-right on the line of code
- Code Offset - int - Offset from the entry's Address to the MIPS code compiled from that line.

If CodeWarrior successfully extracts this information, it should appear similar to the following example:

            *** DWARF Line Table (.line) ***

            length      address

0x00000000  0x000001fc  0x00100000

            line number, char offset, code offset

0x00000008             34,          -1, 0x00000000
0x00000012             35,          -1, 0x00000004
0x0000001c             42,          -1, 0x00000008
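
The layout above can be read with something like the following sketch. It assumes little-endian values and that Length counts the 8-byte header as well as the rows, which matches the example (the first row sits at 0x00000008, and 0x1fc minus 8 divides evenly by 10). The names are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t line;         /* source line number */
    int16_t  char_offset;  /* column, or -1 when not recorded */
    uint32_t code_offset;  /* offset from the entry's address */
} LineRow;

static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parses one .line entry at buf. Writes the entry's base address and up to
   max_rows rows, and returns the number of rows stored (0 if malformed). */
size_t line_parse_entry(const uint8_t *buf, size_t len, uint32_t *address,
                        LineRow *rows, size_t max_rows)
{
    uint32_t length;
    size_t count, i;
    if (len < 8) return 0;
    length = rd32(buf);
    if (length < 8 || length > len) return 0;
    *address = rd32(buf + 4);
    count = (length - 8) / 10;            /* 10 bytes per row */
    if (count > max_rows) count = max_rows;
    for (i = 0; i < count; i++) {
        const uint8_t *p = buf + 8 + i * 10;
        rows[i].line        = rd32(p);
        rows[i].char_offset = (int16_t)rd16(p + 4);
        rows[i].code_offset = rd32(p + 6);
    }
    return count;
}
```

Adding a row's code offset to the entry's address gives the memory address of the code for that line, for example 0x00100000 + 0x00000004 for line 35 in the example.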

When the char offset is always -1, the compiler did not record column positions, most likely because of how the compilation settings for Metrowerks CodeWarrior were configured. There is no way to retrieve the char offset other than guesswork. If guesswork is desired, refer to the following code, shown on screen in .hack//liminality volume 1 (In the Case of Mai Minase) at 37:00, to see the code style used.


#include "sdvd.h"
#include "stream.h"
#include "data.h"


#define FUKU_DEBUG_OUT


//static ccFileList2 *f1;
//static int fileNum;

void ccInitFileList(void);
void ccAddFileList(ccFileList *);
void ccAddFileListOne(ccFileList *);
void ccAddFileListName(int, char *);
void ccsAllDeleteExceptCMN(void);
void ccsAllDeleteExceptGCMN(void);
void ccFileListDelete(ccFileList *);
int ccFileListDeleteOne(ccFileList *);
void ccsAllDelete(void);
void ccLoadResourceFL(void);
void ccLoadFLAdd(ccFileList *);
void ccLoadFLAddOne(ccFileList *);
static void ccLoadFLStart(void);


void ccAddRequestFileListTEST(void);
static void ccAddFileListName2(int,char*);
FILEDATA* searchFname(FILELIST*);
static int fileConflictCheck(ccFileList* );
//static void fileEsistCheck(int);
void ccFileEsistCheck(int);
static void ccFileListLoad(void*);
/*--- debug & test ---*/
static int     addFileCheck;
/*--- FILEDATA TABLE ---*/
extern FILEDATA pcCCSTbl[];
extern FILEDATA spcCCSTbl[];
extern FILEDATA npcCCSTbl[];
extern FILEDATA townCCSTbl[];
extern FILEDATA gimmickCCSTbl[];
extern FILEDATA enemyCCSTbl[];
extern FILEDATA desktopCCSTbl[];
extern FILEDATA toppageCCSTbl[];
extern FILEDATA fieldCCSTbl[];
extern FILEDATA menuCCSTbl[];
extern FILEDATA cmnCCSTbl[];
extern FILEDATA gcmnCCSTbl[];
extern FILEDATA equipCCSTbl[];
extern FILEDATA effectCCSTbl[];
extern FILEDATA dungeonCCSTbl[];
extern FILEDATA eventCCSTbl[];
extern FILEDATA bossCCSTbl[];
extern FILEDATA skillCCSTbl[];
/*-----------------------*/
static const int FILELIST_DIRECT_MAX = 16;
static FILEDATA directCCSTbl[FILELIST_DIRECT_MAX];
static int directNum;
/*---------------------------------------*/
static char *categoryPathTbl[]={
       {DATA_FILE_DIR"cmn.bin"},
       {DATA_FILE_DIR"gcmn.bin"},
       {DATA_FILE_DIR"demo.bin"},
       {DATA_FILE_DIR"desktop.bin"},
       {DATA_FILE_DIR"toppage.bin"},
       {DATA_FILE_DIR"menu.bin"},
       {DATA_FILE_DIR"effect.bin"},
       {DATA_FILE_DIR"equip.bin"},
       {DATA_FILE_DIR"skill.bin"},
       {DATA_FILE_DIR"spc.bin"},
       {DATA_FILE_DIR"pc.bin"},
       {DATA_FILE_DIR"npc.bin"},
       {DATA_FILE_DIR"gimmic.bin"},
       {DATA_FILE_DIR"enemy.bin"},
       {DATA_FILE_DIR"boss.bin"},
       {DATA_FILE_DIR"town.bin"},
       {DATA_FILE_DIR"field.bin"},
       {DATA_FILE_DIR"dungeon.bin"},
       {DATA_FILE_DIR"event.bin"},
       {DATA_FILE_DIR},
};
static FILEDATA *categoryFDTbl[]={
       &cmnCCSTbl[0],
       &gcmnCCSTbl[0],
       NULL,
       &desktopCCSTbl[0],
       &toppageCCSTbl[0],
       &menuCCSTbl[0],
       &effectCCSTbl[0],
       &equipCCSTbl[0],
       &skillCCSTbl[0],
       &spcCCSTbl[0],
       &pcCCSTbl[0],
       &npcCCSTbl[0],
       &gimmickCCSTbl[0],
       &enemyCCSTbl[0],
       &bossCCSTbl[0],
       &townCCSTbl[0],
       &fieldCCSTbl[0],
       &dungeonCCSTbl[0],
       &eventCCSTbl[0],
       &directCCSTbl[0],
};
static u_long   cateCDOfsTbl[]={
       _CMN_OFS,
       _GCMN_OFS,
       NULL,
       _DESKTOP_OFS,
       _TOPPAGE_OFS,
       _MENU_OFS,
       _EFFECT_OFS,
       _EQUIP_OFS,
       _SKILL_OFS,
       _SPC_OFS,
       _PC_OFS,
       _NPC_OFS,
       _GIMMICK_OFS,
       _ENEMY_OFS,
       _BOSS_OFS,
       _TOWN_OFS,
       _FIELD_OFS,
       _DUNGEON_OFS,
       _EVENT_OFS,
       NULL
};
//------------------------------------------------//
//                                                //
//------------------------------------------------//
static const int SCENE_FILELIST_MAX = 64 + 32;

ELF .reginfo section

.reginfo records which processor and coprocessor registers the program uses, as one bit mask per set of registers, along with the global pointer's value.

To-do: structure information and how to retrieve it

If CodeWarrior successfully extracts this section, it should appear similar to the following example:

            *** REGISTER INFO (.reginfo) ***

                                 28   24   20   16   12    8    4    0
General purpose register mask: 1111 0111 1111 1111 1111 1111 1111 1110
Coprocessor 0 register mask:   0000 0000 0000 0000 0000 0000 0000 0000
Coprocessor 1 register mask:   1111 1111 1111 1111 1111 1111 1111 1111
Coprocessor 2 register mask:   0000 0000 0000 0000 0000 0000 0000 0000
Coprocessor 3 register mask:   0000 0000 0000 0000 0000 0000 0000 0000
GP value: 00360c70
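
The six values in this dump line up with the standard MIPS Elf32_RegInfo layout: one 32-bit general purpose register mask, four 32-bit coprocessor masks, and a 32-bit GP value, 24 bytes in total. The sketch below assumes the game's .reginfo section uses that standard layout with little-endian values. The struct and function names are made up for this guide.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t gpr_mask;     /* ri_gprmask: bit n set = register n is used */
    uint32_t cpr_mask[4];  /* ri_cprmask: one mask per coprocessor 0..3 */
    uint32_t gp_value;     /* ri_gp_value, kept unsigned for hex printing */
} RegInfo;

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parses a 24-byte .reginfo section. Returns 1 on success, 0 if too short. */
int reginfo_parse(const uint8_t *buf, size_t len, RegInfo *out)
{
    int i;
    if (len < 24) return 0;
    out->gpr_mask = rd32(buf);
    for (i = 0; i < 4; i++)
        out->cpr_mask[i] = rd32(buf + 4 + 4 * i);
    out->gp_value = rd32(buf + 20);
    return 1;
}

/* Returns 1 if general purpose register reg (0..31) is marked as used. */
int reginfo_uses_gpr(const RegInfo *ri, unsigned reg)
{
    return reg < 32 && ((ri->gpr_mask >> reg) & 1u) != 0;
}
```

Read this way, the mask in the example is 0xF7FFFFFE, so every general purpose register is marked as used except register 0 and register 27.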

MIPS Instruction Set

The PlayStation 2's Emotion Engine uses its own variant of the MIPS instruction set. You can find additional information on it at:
https://psi-rockin.github.io/ps2tek/#ee - Helpful in understanding which bits represent which opcode and each opcode's subclass.
https://web.archive.org/web/20230111053431/https://inst.eecs.berkeley.edu/~cs61c/resources/MIPS_help.html - Helpful in understanding the fields of each instruction and what each instruction does.
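
To give a feel for how an instruction is laid out, here is a small sketch that splits a 32-bit instruction word into the standard MIPS fields. It only covers the generic field positions and does not identify instructions specific to the Emotion Engine. The struct and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    unsigned opcode;  /* bits 31..26 */
    unsigned rs;      /* bits 25..21 */
    unsigned rt;      /* bits 20..16 */
    unsigned rd;      /* bits 15..11 (R-type) */
    unsigned shamt;   /* bits 10..6  (R-type) */
    unsigned funct;   /* bits 5..0   (R-type) */
    uint16_t imm;     /* bits 15..0  (I-type) */
} MipsFields;

/* Extracts every field; which ones are meaningful depends on the opcode. */
MipsFields mips_decode(uint32_t word)
{
    MipsFields f;
    f.opcode = (word >> 26) & 0x3F;
    f.rs     = (word >> 21) & 0x1F;
    f.rt     = (word >> 16) & 0x1F;
    f.rd     = (word >> 11) & 0x1F;
    f.shamt  = (word >> 6) & 0x1F;
    f.funct  = word & 0x3F;
    f.imm    = (uint16_t)(word & 0xFFFF);
    return f;
}
```

For example, the word 0x03E00008 decodes to opcode 0, rs 31, funct 8, which is jr $ra, the usual return from a function.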

The PCSX2 debugger and PS2DIS can be incredibly helpful in understanding code as it runs, as they load symbols directly from the ELF and display them when available. However, because the MIPS code is split between the ELF and PRG files, they can lose track of symbols in certain sections of code.

Additional Tools

- Metrowerks CodeWarrior 2.7 with PS2 Support
- HxD - A free-to-use hex editor
- PCSX2 - A free-to-use PlayStation 2 emulator and debugger (recommended to use version 1.6 if Cheat Engine is desired)
- PS2DIS - A free-to-use PlayStation 2 MIPS instruction set disassembler, which can also load in symbols
- Cheat Engine - A free-to-use memory viewer and editor
- WinRAR - A free trial product for extracting archives (the .ISO format is an archive file)
- Visual Studio Community - A free-to-use code editor and compiler for C# / C / C++ / etc.
- Notepad++ - A light-weight code editor and text editor to look at standalone files
- An understanding of C/C++ - Since the code is written primarily in C++, understanding it is crucial to understanding the debug information and symbols.

Misc Information

In a debug symbol's name, __ct means constructor. A constructor is the first piece of code run when creating data for an object. An object holds the data described by a class. __dt means destructor. A destructor is the last piece of code run for an object before it is deleted from memory. See https://www.w3schools.com/cpp/cpp_classes.asp for more information on object-oriented programming with C++.
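
When going through a long list of symbols, a quick filter like the sketch below can pick out constructors and destructors. It assumes the __ct or __dt marker sits at the very start of the symbol name, which should be checked against the actual symbols. The enum and function names are made up.

```c
#include <string.h>

typedef enum { SYM_OTHER, SYM_CONSTRUCTOR, SYM_DESTRUCTOR } SymKind;

/* Classifies a symbol name by its leading __ct / __dt marker. */
SymKind classify_symbol(const char *name)
{
    if (strncmp(name, "__ct", 4) == 0) return SYM_CONSTRUCTOR;
    if (strncmp(name, "__dt", 4) == 0) return SYM_DESTRUCTOR;
    return SYM_OTHER;
}
```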
