Skip to content

C# to Node.js Migration Context Document

Purpose: This document provides comprehensive context for migrating Excel manipulation logic from C# (using Spire.XLS) to Node.js (using Aspose). It is intended to be consumed by a planning agent to create a systematic migration strategy for work in 5-10 hour chunks.

Repository Context: This is the nextgen-cs repository containing the C# tools. Migration work will be performed in a separate repository that currently calls these tools via command line.


Table of Contents

  1. Executive Summary
  2. Tool Inventory
  3. SCRIBE - Generic Excel Processor
  4. AFR Templater - Financial Report Generator
  5. Spire.XLS Usage Patterns
  6. Migration Considerations
  7. Testing Requirements
  8. Appendix: Code Locations

Executive Summary

Current State

  • Two C# tools built on .NET 7.0 using Spire.XLS 13.12.0 for Excel manipulation
  • Licensed Spire.XLS requiring SPIRE_LICENSE_KEY environment variable
  • Command-line interface tools called from a separate Node.js repository
  • Docker-based deployment with specific Linux dependencies for graphics support

Migration Goals

  • Replace Spire.XLS with Aspose for Node.js
  • Move all Excel manipulation logic into the existing Node.js tech stack
  • Maintain functional parity with existing tools
  • Work in manageable 5-10 hour chunks
  • Preserve extensive business logic for Arizona educational financial reporting

Complexity Assessment

Tool Lines of Code Complexity Business Logic Density
SCRIBE ~2,500 Medium Low (generic operations)
AFR Templater ~6,500 High Very High (domain-specific)

Tool Inventory

1. SCRIBE (Streamlined Cell Runtime for Business Execution)

Purpose: Generic Excel automation tool that executes instruction files

Location: src/Scribe/

Complexity: Medium

Migration Priority: High (simpler, useful foundation for AFR Templater migration)

2. AFR Templater

Purpose: Specialized tool for generating Annual Financial Reports for Arizona school districts

Location: src/afr-templater/

Complexity: High

Migration Priority: High (core business value)


SCRIBE - Generic Excel Processor

Overview

SCRIBE is a domain-agnostic CLI tool for Excel automation. It reads .scribe instruction files containing line-by-line commands for manipulating Excel workbooks. Think of it as a simple scripting language for Excel operations.

Architecture

Program.cs (Entry Point)
    ↓
InstructionParser.cs (Parse .scribe file)
    ↓
List<BaseInstruction> (Parsed instruction objects)
    ↓
InstructionInvoker.cs (Execute instructions in order)
    ↓
ScribeStore.cs (In-memory workbook/cell storage)
    ↓
SpireHelper.cs (Abstraction over Spire.XLS)

Command-Line Interface

scribe process --instructionsPath tmp/example.scribe

Input: .scribe text files with format OPERATION:arg1:arg2:...

Output: Modified Excel files saved to disk

Instruction Set (12 Operations)

File Operations

  1. LOAD - Load Excel file into memory

    • Format: LOAD:<filepath>:<name>
    • Stores workbook in ScribeStore with given name
    • Example: LOAD:tmp/2023AFRTemplate.xlsx:AFRTemplate
  2. SAVE - Write workbook to disk

    • Format: SAVE:<workbook_name>:<filepath>
    • Example: SAVE:AFRTemplate:tmp/2023AFRFinalNew.xlsx

Calculation Operations

  1. CALCULATE - Force formula recalculation
    • Workbook: CALCULATE:<workbook>
    • Worksheet: CALCULATE:<workbook>:<worksheet>
    • Cell: CALCULATE:<workbook>:<worksheet>:<cell>
    • Range: CALCULATE:<workbook>:<worksheet>:<start>:<end>

Cell Copy/Paste Operations

  1. COPY - Copy single cell value to memory

    • Format: COPY:<workbook>:<worksheet>:<cell>:<name>
    • Stores cell value in ScribeStore with given name
    • Preserves cell type (formula, number, text, etc.)
  2. COPY_RANGE - Copy cell range to memory

    • Format: COPY_RANGE:<workbook>:<worksheet>:<start>:<end>:<name>
    • Example: COPY_RANGE:AFRTemplate:SourceSheet:A1:D10:MyRange
  3. PASTE - Paste previously copied cell

    • Format: PASTE:<workbook>:<worksheet>:<cell>:<name>
    • Retrieves from ScribeStore by name
  4. PASTE_RANGE - Paste previously copied range

    • Format: PASTE_RANGE:<workbook>:<worksheet>:<start>:<end>:<name>

Data Writing Operations

  1. WRITE - Write typed value to cell
    • Format: WRITE:<workbook>:<worksheet>:<cell>:<datatype>:<value>
    • Data Types:
      • TEXT - String value
      • NUMBER - Numeric value
      • DOLLAR - Currency value
      • DATE - MM/dd/yyyy format (converts to Excel serial)
      • FORMULA - Excel formula (starts with =)
      • BOOLEAN - TRUE/FALSE
      • DROPDOWN - Data validation dropdown
      • BLANK - Clear cell
    • Examples: WRITE:AFRTemplate:Cover:D3:TEXT:Tolleson Union High School WRITE:AFRTemplate:Cover:F16:NUMBER:100 WRITE:AFRTemplate:Data Entry:Q1:DOLLAR:100.97 WRITE:AFRTemplate:Page 8:J10:FORMULA:=SUM(A1:A10) WRITE:AFRTemplate:Config:A1:DATE:12/31/2023 WRITE:AFRTemplate:Status:B2:BOOLEAN:TRUE WRITE:AFRTemplate:Clean:C3:BLANK:

Arithmetic Operations

  1. ADD_CELL - Add one cell’s value to another cell

    • Format: ADD_CELL:<wb>:<ws>:<target>:<wb>:<ws>:<source>
    • Reads numeric value from source, adds to target
    • Example: ADD_CELL:AFRTemplate:Data:Q1:AFRTemplate:Data:Q2
  2. ADD_RAW - Add raw number to cell

    • Format: ADD_RAW:<workbook>:<worksheet>:<cell>:<number>
    • Example: ADD_RAW:AFRTemplate:Data Entry:Q1:253.46
  3. MULTIPLY_RANGE - Multiply all cells in range by number

    • Format: MULTIPLY_RANGE:<range_name>:<multiplier>:<decimal_places>
    • Requires prior COPY_RANGE to store range with name
    • Example: MULTIPLY_RANGE:RANGE_TO_MULT:1.725:3

Debug Operations

  1. DUMP - Output cell information for debugging
    • Format: DUMP:<workbook>:<worksheet>:<cell>
    • Prints type and value to console

Key Components

ScribeStore (src/Scribe/Utilities/ScribeStore.cs)

Purpose: In-memory storage during instruction execution

Data Structures:

  • Dictionary<string, Workbook> - Named workbooks
  • Dictionary<string, CellRange> - Named cell ranges (for copy/paste)
  • Dictionary<string, XlsCellValue> - Named single cells

Methods:

  • AddWorkbook(name, workbook)
  • GetWorkbook(name)
  • AddCellRange(name, range)
  • GetCellRange(name)

SpireHelper (src/Scribe/Utilities/SpireHelper.cs)

Purpose: Abstraction layer over Spire.XLS API

Critical Functions:

  1. ExtractValue(CellRange)XlsCellValue

    • Determines cell type (formula, number, text, boolean, date, blank)
    • Extracts value in appropriate format
    • Handles formula evaluation
    • Type detection logic: csharp if (cell.HasFormula) → XlsCellValueType.FORMULA if (cell.IsBlank) → XlsCellValueType.BLANK if (cell.IsDateTime) → XlsCellValueType.DATE if (cell.Value == "TRUE" || "FALSE") → XlsCellValueType.BOOLEAN if (cell.NumberValue exists) → XlsCellValueType.NUMBER else → XlsCellValueType.TEXT
  2. ExtractNumber(CellRange)double

    • Extracts numeric value from cell
    • For formulas: uses FormulaNumberValue
    • For text: uses NumberValue
    • Returns 0 if non-numeric
  3. CalculateCell/Range/Worksheet/Workbook()

    • Calls Spire’s CalculateAllValue() at appropriate scope
    • Used before reading formula results
  4. SetCellValue(CellRange, XlsCellValue)

    • Writes value to cell based on type
    • Handles all 8 data types (TEXT, NUMBER, DOLLAR, DATE, FORMULA, BOOLEAN, DROPDOWN, BLANK)
    • Excel Date Conversion: Converts MM/dd/yyyy to Excel serial number
      • Accounts for Excel 1900 leap year bug (treats 1900 as leap year)
      • Formula: Days since 1/1/1900 + 1 (for bug), +1 (for Excel starting at 1)
  5. SetNumberFormat(CellRange, format)

    • Applies Excel number formatting
    • Used for currency, dates, percentages, etc.
  6. CopyCellValue(source, target)

    • Copies cell value preserving type
    • Copies number format
    • Copies formulas

InstructionParser (src/Scribe/Instructions/InstructionParser.cs)

Purpose: Parse .scribe files into instruction objects

Parsing Logic:

  1. Read file line-by-line
  2. Split by : delimiter
  3. First token = operation type
  4. Remaining tokens = arguments
  5. Validate argument count for operation type
  6. Instantiate appropriate BaseInstruction subclass
  7. Track line numbers for error reporting

Error Handling:

  • Invalid file paths
  • Unknown operation types
  • Incorrect argument counts
  • Malformed instruction syntax

InstructionInvoker (src/Scribe/Instructions/InstructionInvoker.cs)

Purpose: Execute parsed instructions in order

Execution Model:

  1. Iterate through instruction list
  2. Call Execute() on each instruction
  3. Instructions access ScribeStore to get/set workbooks
  4. Instructions use SpireHelper for Excel operations
  5. Bubble up exceptions with line number context

Instruction Implementation Pattern

Each instruction inherits from BaseInstruction interface:

public interface BaseInstruction
{
    void Execute();
}

Example: LoadInstruction.cs

public class LoadInstruction : BaseInstruction
{
    private string _filePath;
    private string _name;
    private ScribeStore _store;

    public LoadInstruction(string filePath, string name, ScribeStore store)
    {
        _filePath = filePath;
        _name = name;
        _store = store;
    }

    public void Execute()
    {
        var workbook = new Workbook();
        workbook.LoadFromFile(_filePath);
        _store.AddWorkbook(_name, workbook);
    }
}

Dependencies on Spire.XLS

Core Spire Types Used:

  • Spire.Xls.Workbook - Workbook object
  • Spire.Xls.Worksheet - Worksheet object
  • Spire.Xls.CellRange - Cell/range object
  • Spire.Xls.ExcelVersion - File format specification

Core Spire Operations:

  • Workbook.LoadFromFile(path) - Load Excel file
  • Workbook.SaveToFile(path, ExcelVersion) - Save Excel file
  • Workbook.Worksheets[name] - Get worksheet by name
  • Worksheet.Range[address] - Get cell/range by address (e.g., “A1”, “B2:D10”)
  • CellRange.Value - Get/set cell value as string
  • CellRange.NumberValue - Get cell numeric value
  • CellRange.FormulaNumberValue - Get formula result as number
  • CellRange.Formula - Get/set cell formula
  • CellRange.Text - Get cell formatted text
  • CellRange.HasFormula - Check if cell has formula
  • CellRange.IsBlank - Check if cell is empty
  • CellRange.IsDateTime - Check if cell is date
  • CellRange.NumberFormat - Get/set cell format string
  • CellRange.CalculateAllValue() - Recalculate formulas
  • CellRange.Clear() - Clear cell contents

Testing

Test Project: tests/Scribe.Tests/

Test Structure:

  • Unit tests for individual instructions
  • Integration tests with sample Excel files
  • Snapshot testing using Verify library
  • Mock ISpireHelper interface for unit testing

Sample Test Files: tests/testfiles/

  • Example .scribe files
  • Sample Excel templates
  • Expected output files

Migration Notes for SCRIBE

Straightforward Migrations

  • File loading/saving
  • Cell reading/writing by address
  • Basic cell operations (copy/paste)
  • Number formatting
  • Formula evaluation

Complex Migrations

  • Excel date conversion (1900 leap year bug handling)
  • Type detection logic (formula vs. value vs. text)
  • Cell range copying with format preservation
  • Dropdown validation setup

Aspose Equivalents Needed

  • Workbook.LoadFromFile() → Aspose workbook loading
  • Worksheet.Range[address] → Aspose cell/range access
  • CellRange.Value, NumberValue, Formula → Aspose cell value access
  • CellRange.NumberFormat → Aspose formatting
  • CalculateAllValue() → Aspose formula calculation
  1. Phase 1: Core infrastructure (ScribeStore, instruction parser, invoker)
  2. Phase 2: File operations (LOAD, SAVE)
  3. Phase 3: Read operations (COPY, COPY_RANGE)
  4. Phase 4: Write operations (WRITE with all types, PASTE, PASTE_RANGE)
  5. Phase 5: Calculation operations (CALCULATE)
  6. Phase 6: Arithmetic operations (ADD_CELL, ADD_RAW, MULTIPLY_RANGE)
  7. Phase 7: Testing and validation

AFR Templater - Financial Report Generator

Overview

AFR Templater is a specialized tool for generating Annual Financial Reports (AFR), school-level reports, and Indirect Cost (IDC) reports for Arizona school districts. It contains extensive business logic specific to Arizona educational financial reporting requirements.

Critical Context: This tool encodes complex, state-mandated reporting requirements that change annually. The business logic is not generic Excel manipulation—it’s domain-specific financial reporting for Arizona K-12 school districts.

Architecture

Program.cs (CLI Entry)
    ↓
ProcessorFactory (Select year-specific processor)
    ↓
FiscalYear202XProcessor (Year-specific logic)
    ↓
Processor (Base class with common logic)
    ↓
├── NormalizedRecordAggregator (Filter/aggregate financial records)
├── CodeConverter (Transform account codes)
├── SpireHelper (Excel operations)
└── Tababular (Table formatting)

Command-Line Interface

1. Generate AFR Report

afr-templater generate-afr-report \
  --config district_config.json \
  --sourceDir /path/to/source/files

Inputs:

  • district_config.json - District configuration (see Config Format below)
  • normalized_records.csv - Financial transaction records
  • {Year}AFRTemplate.xls - Year-specific AFR template
  • Optional: BSA-21 files, budget files

Outputs:

  • {Year} AFR.xls - Main Annual Financial Report
  • {Year} Deseg AFR.xls - Desegregation report (if applicable)
  • {Year} Food Service AFR.xls - Food service report (pre-FY2025)

2. Generate School Report

afr-templater generate-school-report \
  --config district_config.json \
  --sourceDir /path/to/source/files

Outputs:

  • {Year} School-Level Reporting for {N} schools.xls[m] - School-level financial report
  • Format varies by school count:
    • 1 school: Integrated into main AFR (Page 9)
    • 2-25 schools: Standard Excel workbook
    • 26-45 schools: Macro-enabled workbook (.xlsm)

3. Generate IDC Report

afr-templater generate-idc-report \
  --config district_config.json \
  --sourceDir /path/to/source/files

Outputs:

  • {Year} IDC.xlsx - Indirect Cost Report

Config Format (district_config.json)

{
    "entityName": "Tolleson Union High School District",
    "ctdsCode": "070290000",
    "county": "MARICOPA",
    "fiscalYear": 2025,
    "useDeseg": true,
    "useFoodService": false,
    "schools": [
        {
            "unitCode": "001",
            "schoolName": "Tolleson Union High School",
            "isPrivate": false
        },
        {
            "unitCode": "002",
            "schoolName": "La Joya Community High School",
            "isPrivate": false
        }
    ],
    "districtContacts": {
        "preparedBy": "Jane Smith",
        "superintendent": "John Doe",
        "businessManager": "Bob Johnson"
    }
}

Normalized Records Format (normalized_records.csv)

Columns:

  • fund_code (3 digits) - Fund classification (e.g., 001 = General Fund)
  • program_code (3 digits) - Program classification
  • function_code (4 digits) - Functional category (e.g., 1000 = Instruction)
  • object_code (4 digits) - Object classification (e.g., 6100 = Salaries)
  • unit_code (3 digits) - School/location code
  • amount_in_cents (integer) - Transaction amount in cents
  • description (text) - Transaction description
  • source_line (integer) - Original file line number
  • original_account_code (text) - Original account code before normalization

Example:

fund_code,program_code,function_code,object_code,unit_code,amount_in_cents,description,source_line,original_account_code
001,000,1000,6100,001,5000000,Teacher Salaries,42,001-000-1000-6100-001
001,000,2100,6200,001,150000,Admin Benefits,43,001-000-2100-6200-001

Key Insights:

  • Amounts in cents (not dollars) for precision
  • 5-part account code structure (fund-program-function-object-unit)
  • Records are “normalized” from various source formats
  • Balance records use object codes 9100-9999

Processor Architecture

The processor architecture uses inheritance for year-specific customization:

Processor.cs (Base Class - ~770 lines)
    ↑
    ├── FiscalYear2022Processor.cs
    ├── FiscalYear2023Processor.cs
    ├── FiscalYear2024Processor.cs
    ├── FiscalYear2025Processor.cs (~1,984 lines - most complex)
    └── FiscalYear2026Processor.cs

ProcessorFactory.cs selects processor based on fiscal year from config.

Base Processor Class (Processor.cs)

Core Methods

Public API:

  • GenerateAFR() - Generate main AFR
  • GenerateDesegAFR() - Generate desegregation report
  • GenerateFoodServiceAFR() - Generate food service report
  • GenerateSchoolReport() - Generate school-level report
  • GenerateIDCReport() - Generate indirect cost report

Template Population Pipeline:

LoadTemplate()
    ↓
PopulateAFR() [Main orchestrator]
    ↓
├── FillCoverPage()
├── FillAccountingData() [Most complex]
├── FillSchoolData()
├── FillCovidData() (FY2023+)
├── FillBalanceSheet()
├── FillCustomPages() [Year-specific overrides]
└── PerformFinalCalculations()
    ↓
SaveReport()

Key Overridable Methods (Template Method Pattern)

Year-specific processors override these methods to customize behavior:

  1. FillAccountingData(workbook, records) - Populate financial data
  2. FillCustomPages(workbook, records) - Handle year-specific pages
  3. GetPageMapping() - Define which pages exist for this year
  4. GetBalanceAccounts() - Define balance sheet structure
  5. ApplyYearSpecificFormatting(workbook) - Formatting customizations
  6. ValidateTemplate(workbook) - Ensure template has required structure

Critical Shared Logic

School Categorization:

int schoolCount = config.schools.Count;
if (schoolCount == 1)
    // Use AFR Page 9 for single school
else if (schoolCount <= 25)
    // Generate .xls workbook
else if (schoolCount <= 45)
    // Generate .xlsm macro-enabled workbook
else
    throw new Exception("Too many schools (max 45)");

CTDS Type Detection:

// Type 3 CTDS = JTED (Career Technical Education Districts)
bool isType3 = ctdsCode[6] == '3';
if (isType3) {
    // Special handling for JTED districts
}

Private School Detection:

// Private schools excluded from certain reports
var publicSchools = config.schools.Where(s => !s.isPrivate).ToList();

FiscalYear2025Processor (Most Recent)

Size: ~1,984 lines (largest processor)

Unique Features:

  1. Integrated Food Service - Food service merged into main AFR (no separate report)
  2. COVID-19 Reporting (Page 10) - Federal relief fund tracking
  3. Updated Object Code Ranges - New classifications for post-pandemic funding
  4. School-Level Changes - Modified school report templates

Custom Overrides:

  • PopulateFoodServiceData() - Integrate food service into Page 7
  • FillCovidPage() - Populate Page 10 with federal relief funds
  • AdjustBalanceSheet() - Updated balance sheet structure
  • ApplyConditionalFormatting() - Year-specific cell formatting

Key Differences from 2024:

  • Food service no longer separate report
  • New object codes for ESSER (Elementary and Secondary School Emergency Relief) funds
  • Modified Page 1-2 structure
  • Updated balance forward calculations

Code Converter System

Purpose: Transform account codes according to fiscal year rules

CodeConverter.cs (Base Class - ~517 lines)

Core Responsibilities:

  1. Rollup Fund Calculation - Group funds for reporting
  2. Final Fund Determination - Map detailed funds to reporting funds
  3. Primary Unit Code Mapping - Aggregate sub-locations to primary locations
  4. Balance Record Handling - Special treatment for balance records

Key Data Structures:

  1. RollupFundCodes - Fund grouping definitions
Dictionary<string, HashSet<string>> rollupFundCodes = new()
{
    ["100"] = new() { "001", "010", "011", "012", "013", "014" }, // General funds
    ["200"] = new() { "020", "021", "022", "023", "024" },        // Special revenue
    ["600"] = new() { "610", "620", "625", "630", "640" },        // Food service
    // ... etc
};
  1. PrimaryUnitCodeMap - Map unit codes to primary codes
// Example: All elementary schools map to primary code "001"
Dictionary<string, string> primaryUnitCodeMap = new()
{
    ["001"] = "001", // Main Elementary
    ["002"] = "001", // Branch Elementary → maps to 001
    ["003"] = "001", // Another Branch → maps to 001
    ["100"] = "100", // Main High School
    ["101"] = "100", // Branch High School → maps to 100
};
  1. Object Code Ranges - Define balance vs. operating records
int[] balanceObjectCodeRanges = { 9100, 9999 }; // Balance sheet items
int[] operatingObjectCodeRanges = { 1000, 8999 }; // Operating transactions

Key Methods:

  1. CalcRollUpFund(fundCode)string

    • Maps detailed fund code to rollup fund
    • Example: CalcRollUpFund("011")"100" (General Fund rollup)
  2. CalcFinalFund(fundCode, objectCode)string

    • Determines final fund for reporting
    • Considers both fund and object code
    • May reclassify based on object code
  3. CalcPrimaryUnitCode(unitCode)string

    • Maps unit code to primary unit code
    • Used for school-level aggregation
    • Example: Branch schools map to main school
  4. IsBalanceRecord(objectCode)bool

    • Determines if record is balance sheet item
    • Checks if object code in range 9100-9999
  5. CalcBalanceFundTotals(records)Dictionary<string, long>

    • Aggregate balance records by fund
    • Returns total amounts per fund in cents

Year-Specific Code Converters

Each fiscal year has custom converter with rule variations:

FiscalYear2025CodeConverter.cs - Most Recent

  • Updated rollup fund definitions
  • New fund codes for federal relief (ESSER, GEER)
  • Modified object code classifications
  • Updated primary unit code mappings

Key Year-to-Year Changes:

  • New fund codes introduced for COVID relief (2023+)
  • Object code ranges adjusted for new classifications
  • Unit code mappings updated as schools open/close
  • Balance sheet structure changes

NormalizedRecordAggregator

Purpose: Filter and aggregate financial records using sophisticated mask-based pattern matching

Location: src/afr-templater/utilities/NormalizedRecordAggregator.cs

Mask Syntax

Masks use a 5-part format: {fund}.{program}.{function}.{object}.{unit}

Wildcard Support:

  • X - Single wildcard digit
  • Example: 61XX matches 6100-6199
  • Example: 6XXX matches 6000-6999

Range Support:

  • {start-end} - Inclusive range
  • Example: {1000-1899} matches 1000, 1001, …, 1899

Multiple Values:

  • {val1;val2;val3} - Match any of the values
  • Example: {001;010;020} matches unit codes 001, 010, or 020

Compound Patterns:

  • {61XX;6200-6299} - Combine wildcards and ranges
  • Example: Matches 6100-6199 OR 6200-6299

Example Masks:

XXX.XXX.1000.XXXX.XXX - All function 1000 records (instruction)
001.XXX.XXXX.61XX.XXX - Fund 001, object codes 6100-6199 (salaries)
XXX.XXX.{1000-1999}.XXXX.001 - Functions 1000-1999, unit 001
{001;010}.XXX.XXXX.XXXX.XXX - Funds 001 or 010, any other codes
XXX.XXX.61XX.{6100;6200}.XXX - Function 61XX, objects 6100 or 6200

Core Methods

1. AggregateRecordAmounts()

long AggregateRecordAmounts(
    List<NormalizedRecord> records,
    List<List<string>> inclusionMasks,
    List<List<string>> exclusionMasks = null,
    bool flipSign = false,
    bool forcePositive = false
)

Purpose: Sum amounts of records matching inclusion masks, excluding exclusion masks

Parameters:

  • inclusionMasks - List of mask groups (OR between groups, AND within group)
  • exclusionMasks - List of mask groups to exclude
  • flipSign - Multiply result by -1
  • forcePositive - Take absolute value

Logic:

For each record:
  If (matches any inclusion mask group AND matches all masks in that group):
    If (NOT matches any exclusion mask):
      Add record amount to total

If flipSign: total *= -1
If forcePositive: total = abs(total)
Return total

Example Usage:

// Sum all teacher salaries (object 6100-6199) in Fund 001 for Instruction (1000)
var inclusionMasks = new List<List<string>>
{
    new List<string> { "001.XXX.1000.61XX.XXX" }
};
long totalSalaries = aggregator.AggregateRecordAmounts(records, inclusionMasks);

2. BulkAggregate()

Dictionary<string, long> BulkAggregate(
    List<NormalizedRecord> records,
    Dictionary<string, BulkAggregateParams> aggregations
)

Purpose: Perform multiple aggregations in a single pass through records (performance optimization)

Parameters:

  • aggregations - Dictionary mapping result keys to aggregation parameters

Returns: Dictionary mapping keys to aggregated amounts

Example:

var aggregations = new Dictionary<string, BulkAggregateParams>
{
    ["total_salaries"] = new BulkAggregateParams
    {
        InclusionMasks = new() { new() { "XXX.XXX.XXXX.61XX.XXX" } }
    },
    ["total_benefits"] = new BulkAggregateParams
    {
        InclusionMasks = new() { new() { "XXX.XXX.XXXX.62XX.XXX" } }
    }
};
var results = aggregator.BulkAggregate(records, aggregations);
long salaries = results["total_salaries"];
long benefits = results["total_benefits"];

3. ContainsRecordMatchingMasks()

bool ContainsRecordMatchingMasks(
    List<NormalizedRecord> records,
    List<List<string>> inclusionMasks,
    List<List<string>> exclusionMasks = null
)

Purpose: Check if any records match criteria (boolean search)

Use Cases:

  • Determine if district has desegregation funds
  • Check if food service operations exist
  • Validate required account codes are present

Mask Matching Algorithm

Implementation:

bool DoesRecordMatchMaskGroup(NormalizedRecord record, List<string> maskGroup)
{
    // Record must match ALL masks in group (AND logic)
    foreach (var mask in maskGroup)
    {
        if (!DoesRecordMatchMask(record, mask))
            return false;
    }
    return true;
}

bool DoesRecordMatchMask(NormalizedRecord record, string mask)
{
    string[] maskParts = mask.Split('.');
    string[] recordParts = {
        record.FundCode,
        record.ProgramCode,
        record.FunctionCode,
        record.ObjectCode,
        record.UnitCode
    };

    for (int i = 0; i < 5; i++)
    {
        if (!DoesPartMatch(recordParts[i], maskParts[i]))
            return false;
    }
    return true;
}

bool DoesPartMatch(string recordValue, string maskPattern)
{
    if (maskPattern == "XXX" || maskPattern == "XXXX")
        return true; // Wildcard matches anything

    if (maskPattern.StartsWith("{"))
    {
        // Parse compound pattern: {val1;val2;range1-range2}
        return MatchesCompoundPattern(recordValue, maskPattern);
    }

    if (maskPattern.Contains("X"))
    {
        // Wildcard pattern: 61XX matches 6100-6199
        return MatchesWildcardPattern(recordValue, maskPattern);
    }

    return recordValue == maskPattern; // Exact match
}

Real-World Usage Examples

Example 1: Calculate Total Instruction Expenses

// All expenses (object 6XXX) for Instruction (function 1000)
var inclusionMasks = new List<List<string>>
{
    new List<string> { "XXX.XXX.1000.6XXX.XXX" }
};
long instructionTotal = aggregator.AggregateRecordAmounts(records, inclusionMasks);

Example 2: Calculate Fund 001 Revenues Excluding Transfers

// Fund 001, all revenue object codes (5XXX) except transfers (5200)
var inclusionMasks = new List<List<string>>
{
    new List<string> { "001.XXX.XXXX.5XXX.XXX" }
};
var exclusionMasks = new List<List<string>>
{
    new List<string> { "001.XXX.XXXX.5200.XXX" }
};
long revenuesExcludingTransfers = aggregator.AggregateRecordAmounts(
    records, inclusionMasks, exclusionMasks);

Example 3: Calculate School-Specific Salaries

// Salaries (61XX) for specific school (unit 003)
var inclusionMasks = new List<List<string>>
{
    new List<string> { "XXX.XXX.XXXX.61XX.003" }
};
long schoolSalaries = aggregator.AggregateRecordAmounts(records, inclusionMasks);

Example 4: Complex Multi-Fund Aggregation

// Salaries in General Fund (001) OR Special Revenue (020)
var inclusionMasks = new List<List<string>>
{
    new List<string> { "001.XXX.XXXX.61XX.XXX" }, // Fund 001 salaries
    new List<string> { "020.XXX.XXXX.61XX.XXX" }  // Fund 020 salaries
};
// Note: OR between mask groups
long totalSalaries = aggregator.AggregateRecordAmounts(records, inclusionMasks);

Excel Template Structure

AFR templates are complex Excel workbooks with:

Page Structure (varies by year):

  • Cover - District identification, signatures
  • Page 1 - Balance Sheet (Assets)
  • Page 2 - Balance Sheet (Liabilities & Fund Balance)
  • Page 3 - Statement of Revenues, Expenditures, and Changes in Fund Balance (Part I)
  • Page 4 - Statement of Revenues, Expenditures, and Changes in Fund Balance (Part II)
  • Page 5 - Expenditures by Function
  • Page 6 - Expenditures by Object
  • Page 7 - Food Service Operations (integrated in FY2025+)
  • Page 8 - Reconciliation
  • Page 9 - Single School District Data (conditional)
  • Page 10 - COVID-19 Federal Relief Funds (FY2023+)
  • Accounting Data - Raw data sheet (hidden)
  • School Listing - School roster
  • Fund Balance Reserve - Reserve calculations

Cell References: AFR pages use a mix of:

  • Direct values (populated by tool)
  • Formulas referencing Accounting Data sheet
  • Cross-page formula references

Data Population Pattern:

  1. Populate Accounting Data sheet with aggregated amounts
  2. Page formulas automatically calculate from Accounting Data
  3. Tool also directly populates certain cells on pages
  4. Tool clears conditional formatting that interferes with formulas

CellInfo Mapping System

Purpose: Define source-to-destination cell mappings with transformations

Structure:

public class CellInfo
{
    public string SourceWorksheet { get; set; }
    public string SourceCell { get; set; }
    public string DestinationWorksheet { get; set; }
    public string DestinationCell { get; set; }
    public Func<object, object> Transform { get; set; } // Optional transformation
    public string NumberFormat { get; set; } // Optional formatting
    public bool ClearConditionalFormatting { get; set; }
    public bool ForceText { get; set; }
    public bool ForceNumber { get; set; }
}

Example Usage:

var mappings = new List<CellInfo>
{
    new CellInfo
    {
        SourceWorksheet = "Config",
        SourceCell = "A1",
        DestinationWorksheet = "Cover",
        DestinationCell = "D3",
        ForceText = true
    },
    new CellInfo
    {
        SourceWorksheet = "Data",
        SourceCell = "B10",
        DestinationWorksheet = "Page 5",
        DestinationCell = "F16",
        Transform = (val) => (long)val / 100, // Convert cents to dollars
        NumberFormat = "#,##0.00"
    }
};

foreach (var mapping in mappings)
{
    var sourceValue = GetCellValue(mapping.SourceWorksheet, mapping.SourceCell);
    var transformedValue = mapping.Transform?.Invoke(sourceValue) ?? sourceValue;
    SetCellValue(mapping.DestinationWorksheet, mapping.DestinationCell,
                 transformedValue, mapping.NumberFormat);
}

FillAccountingData - The Core Method

Purpose: Populate the Accounting Data worksheet with aggregated financial amounts

This is the most complex method in the entire codebase (~500+ lines in some processors)

High-Level Flow:

1. Initialize Accounting Data worksheet
2. For each row in Accounting Data template:
   a. Read row definition (fund, function, object ranges)
   b. Build inclusion/exclusion masks based on row definition
   c. Aggregate records using NormalizedRecordAggregator
   d. Convert cents to dollars
   e. Write amount to cell
   f. Apply number formatting
3. Calculate subtotals and totals
4. Populate balance forward amounts
5. Perform cross-sheet validations

Pseudo-Code Example:

void FillAccountingData(Workbook workbook, List<NormalizedRecord> records)
{
    var ws = workbook.Worksheets["Accounting Data"];
    var aggregator = new NormalizedRecordAggregator();

    // Row 10: Fund 001, Function 1000, Objects 6100-6199 (Instruction Salaries)
    var masks = new List<List<string>>
    {
        new List<string> { "001.XXX.1000.{6100-6199}.XXX" }
    };
    long amountInCents = aggregator.AggregateRecordAmounts(records, masks);
    double amountInDollars = amountInCents / 100.0;
    ws.Range["F10"].NumberValue = amountInDollars;
    ws.Range["F10"].NumberFormat = "#,##0.00";

    // Row 11: Fund 001, Function 1000, Objects 6200-6299 (Instruction Benefits)
    masks = new List<List<string>>
    {
        new List<string> { "001.XXX.1000.{6200-6299}.XXX" }
    };
    amountInCents = aggregator.AggregateRecordAmounts(records, masks);
    amountInDollars = amountInCents / 100.0;
    ws.Range["F11"].NumberValue = amountInDollars;
    ws.Range["F11"].NumberFormat = "#,##0.00";

    // ... hundreds more rows ...

    // Totals calculated via Excel formulas in template
    // Tool triggers formula recalculation at end
    workbook.CalculateAllValue();
}

Complexity Factors:

  1. Row Count: 200-300 rows in Accounting Data
  2. Mask Complexity: Some rows use 5-10 mask groups
  3. Conditional Logic: Different logic based on district characteristics
  4. Balance Handling: Special treatment for balance records vs. operating records
  5. Year-Specific Rules: Each fiscal year has unique account code mappings

Year-over-Year Migration

Documentation: /docs/AFR/

Each fiscal year folder contains:

  • README.md - Changes from previous year
  • Sample templates
  • Test cases
  • Mapping spreadsheets (old codes → new codes)

Example: FY2023 Changes

  • Added Page 10 for COVID-19 reporting
  • New fund codes: 280 (ESSER), 285 (GEER)
  • Modified Page 3-4 revenue categories
  • Updated balance sheet structure

Example: FY2025 Changes

  • Integrated food service into main AFR (removed separate report)
  • Modified Page 7 structure
  • Updated object code ranges for post-pandemic classifications
  • New school-level report template

School-Level Reporting

Complexity: High - generates separate workbook with per-school financial data

Structure:

  • One worksheet per school
  • Each worksheet contains:
    • School identification
    • Revenues by source
    • Expenditures by function
    • Expenditures by object
    • Fund balance information

Implementation:

void GenerateSchoolReport(Config config, List<NormalizedRecord> records)
{
    var workbook = new Workbook();

    foreach (var school in config.schools.Where(s => !s.isPrivate))
    {
        var worksheet = workbook.Worksheets.Add(school.schoolName);

        // Filter records for this school's unit code
        var schoolRecords = records
            .Where(r => r.UnitCode == school.unitCode)
            .ToList();

        // Populate worksheet with school-specific aggregations
        PopulateSchoolWorksheet(worksheet, schoolRecords, school);
    }

    // Apply macro if 26+ schools
    if (config.schools.Count >= 26)
    {
        ApplySchoolReportMacros(workbook);
        workbook.SaveToFile($"{year} School-Level Reporting.xlsm", ExcelVersion.Version2016);
    }
    else
    {
        workbook.SaveToFile($"{year} School-Level Reporting.xls", ExcelVersion.Version2003);
    }
}

Macro Functionality (26+ schools):

  • Navigation menu
  • Print multiple schools
  • Hide/show worksheets
  • Data validation
  • Note: Macros are VBA code embedded in template, not generated by C# tool

IDC (Indirect Cost) Reporting

Purpose: Calculate indirect cost rate for federal grant compliance

Formula:

Indirect Cost Rate = Total Indirect Costs / Total Direct Costs

Implementation:

  1. Aggregate indirect costs (typically function 2300-2600)
  2. Aggregate direct costs (typically function 1000-2200)
  3. Calculate percentage rate
  4. Populate IDC template
  5. Include prior year comparisons

Template Structure:

  • Current year calculations
  • 3-year historical trend
  • Justification notes
  • Signature lines

Dependencies on Spire.XLS

Core Operations Used:

  • File loading/saving (multiple Excel versions)
  • Worksheet access by name/index
  • Cell/range reading and writing
  • Formula recalculation (CalculateAllValue())
  • Number formatting
  • Conditional formatting clearing
  • Font and style management
  • Hidden worksheet handling
  • Workbook-level operations (add/remove sheets)
  • Macro preservation (for .xlsm files)

Specific Spire Features:

  • Support for Excel 97-2003 (.xls) format (required for AFR)
  • Support for Excel 2016+ (.xlsx, .xlsm) format (for school reports, IDC)
  • Formula preservation during cell operations
  • Number format string application
  • Date/time value handling
  • Large workbook performance

Testing

Test Project: tests/afr-templater.Tests/

Test Strategy:

  1. Unit Tests - Individual methods (CodeConverter, NormalizedRecordAggregator)
  2. Integration Tests - Full report generation with sample data
  3. Snapshot Tests - Compare generated reports to known-good outputs
  4. Regression Tests - Ensure changes don’t break previous years

Test Data:

  • sample_config.json - Representative district configurations
  • sample_normalized_records.csv - Curated financial data
  • expected_outputs/ - Known-good generated reports

Challenges:

  • Excel binary comparison (use snapshot serialization)
  • Floating-point precision in financial calculations
  • Year-specific logic coverage
  • Template version management

Migration Notes for AFR Templater

Straightforward Migrations

  • File I/O (loading/saving workbooks)
  • Basic cell reading/writing
  • Number formatting
  • Worksheet access
  • Formula recalculation

Moderate Complexity Migrations

  • NormalizedRecordAggregator (pure logic, no Excel dependencies)
  • CodeConverter system (pure logic)
  • Config parsing (JSON)
  • School categorization logic

High Complexity Migrations

  • FillAccountingData method (~500+ lines per year)
  • Template version handling (5+ years)
  • Macro preservation for .xlsm files
  • Conditional formatting clearing
  • Year-specific processor customizations

Aspose Equivalents Needed

  • Multi-version Excel format support (.xls, .xlsx, .xlsm)
  • Formula preservation and recalculation
  • Conditional formatting access and manipulation
  • Hidden worksheet handling
  • Number format string application
  • Macro-enabled workbook support

Phase 1: Foundation (8-10 hours)

  • CodeConverter and year-specific converters
  • NormalizedRecordAggregator (with comprehensive tests)
  • Config parsing
  • Aspose workbook loading/saving

Phase 2: Core Template Operations (10-15 hours)

  • Cell reading/writing with type handling
  • Number formatting
  • Worksheet access patterns
  • Formula recalculation

Phase 3: Simple Report Generation (10-15 hours)

  • Cover page population
  • School listing generation
  • Basic cell mapping system (CellInfo)

Phase 4: Accounting Data Population (15-20 hours per year)

  • Start with most recent year (FY2025)
  • FillAccountingData method
  • Integration with NormalizedRecordAggregator
  • Extensive testing against known outputs

Phase 5: Advanced Features (10-15 hours)

  • Desegregation report
  • Food service integration (FY2025+)
  • COVID-19 reporting (FY2023+)
  • Macro-enabled school reports

Phase 6: Historical Year Support (5-10 hours per year)

  • Migrate FY2024 processor
  • Migrate FY2023 processor
  • Migrate FY2022 processor
  • Consider: Do all years need migration, or only recent years?

Phase 7: IDC Reporting (5-8 hours)

  • IDC calculation logic
  • IDC template population

Total Estimated Effort: 70-110 hours (can be done in 7-11 work chunks of 5-10 hours each)


Spire.XLS Usage Patterns

License Management

Requirement: Licensed Spire.XLS with SPIRE_LICENSE_KEY environment variable

Activation:

using Spire.Xls;

// Set license (usually in Program.cs startup)
string licenseKey = Environment.GetEnvironmentVariable("SPIRE_LICENSE_KEY");
if (!string.IsNullOrEmpty(licenseKey))
{
    Spire.License.LicenseProvider.SetLicenseKey(licenseKey);
}

Without License: Generated files contain “Evaluation Warning” watermarks

Common Patterns

Pattern 1: Load, Modify, Save

var workbook = new Workbook();
workbook.LoadFromFile("template.xls");

var worksheet = workbook.Worksheets["Page 1"];
worksheet.Range["A1"].Value = "New Value";

workbook.SaveToFile("output.xls", ExcelVersion.Version2003);

Pattern 2: Multi-Workbook Operations

var source = new Workbook();
source.LoadFromFile("source.xls");

var destination = new Workbook();
destination.LoadFromFile("destination.xls");

// Copy value from source to destination
var sourceCell = source.Worksheets["Data"].Range["A1"];
var destCell = destination.Worksheets["Results"].Range["B2"];
destCell.Value = sourceCell.Value;

destination.SaveToFile("output.xls", ExcelVersion.Version2003);

Pattern 3: Formula Handling

var worksheet = workbook.Worksheets["Calculations"];

// Write formula
worksheet.Range["A1"].Formula = "=SUM(B1:B10)";

// Recalculate formulas
workbook.CalculateAllValue();

// Read formula result
double result = worksheet.Range["A1"].FormulaNumberValue;

Pattern 4: Range Operations

// Single cell
worksheet.Range["A1"].Value = "Hello";

// Named range
worksheet.Range["A1:D10"].NumberFormat = "#,##0.00";

// Cell-by-cell iteration
for (int row = 1; row <= 10; row++)
{
    for (int col = 1; col <= 5; col++)
    {
        var cell = worksheet.Range[row, col];
        cell.Value = $"R{row}C{col}";
    }
}

Pattern 5: Type Detection

var cell = worksheet.Range["A1"];

if (cell.HasFormula)
{
    string formula = cell.Formula;
    double result = cell.FormulaNumberValue;
}
else if (cell.IsBlank)
{
    // Cell is empty
}
else if (cell.IsDateTime)
{
    DateTime date = cell.DateTimeValue;
}
else if (!string.IsNullOrEmpty(cell.Value))
{
    string text = cell.Value;
    double number = cell.NumberValue;
}

Pattern 6: Number Formatting

// Currency format
worksheet.Range["A1"].NumberFormat = "$#,##0.00";

// Percentage format
worksheet.Range["A2"].NumberFormat = "0.00%";

// Date format
worksheet.Range["A3"].NumberFormat = "mm/dd/yyyy";

// Accounting format
worksheet.Range["A4"].NumberFormat = "_($* #,##0.00_);_($* (#,##0.00);_($* \"-\"??_);_(@_)";

Excel Version Support

Spire.XLS ExcelVersion Enum:

  • ExcelVersion.Version97to2003.xls (max 65,536 rows)
  • ExcelVersion.Version2007.xlsx (max 1,048,576 rows)
  • ExcelVersion.Version2010.xlsx
  • ExcelVersion.Version2013.xlsx, .xlsm
  • ExcelVersion.Version2016.xlsx, .xlsm

AFR Requirement: Must save as Excel 97-2003 (.xls) for compatibility with state systems

School Reports: Can use Excel 2016 (.xlsm) for macro support

Date Handling

Excel Date Serial Number:

  • Excel stores dates as numbers (serial days since 1/1/1900)
  • Excel has a bug: treats 1900 as a leap year (it wasn’t)
  • Serial number 1 = 1/1/1900
  • Serial number 2 = 1/2/1900
  • Serial number 60 = 2/29/1900 (doesn’t exist, but Excel thinks it does)
  • Serial number 61 = 3/1/1900

Spire Date Conversion:

// Read date
DateTime date = cell.DateTimeValue;

// Write date (Spire handles serial conversion)
cell.DateTimeValue = new DateTime(2023, 12, 31);

// Manual conversion (matching SCRIBE logic)
double DateToSerial(string dateStr)
{
    var parts = dateStr.Split('/');
    int month = int.Parse(parts[0]);
    int day = int.Parse(parts[1]);
    int year = int.Parse(parts[2]);

    var date = new DateTime(year, month, day);
    var baseDate = new DateTime(1900, 1, 1);

    int daysSince1900 = (date - baseDate).Days + 1; // +1 for Excel starting at 1

    // Account for Excel 1900 leap year bug
    if (date >= new DateTime(1900, 3, 1))
        daysSince1900++; // Add extra day for fake 2/29/1900

    return daysSince1900;
}

Performance Considerations

Large Workbooks:

  • Spire can be memory-intensive for large files
  • Recommendation: Process worksheets sequentially, dispose when done
  • Avoid loading multiple large workbooks simultaneously

Formula Calculation:

  • CalculateAllValue() can be slow on complex workbooks
  • Only call when necessary (typically once at end)
  • Can calculate at worksheet or cell level for targeted recalculation

Best Practices:

  1. Dispose workbooks when done: workbook.Dispose()
  2. Use specific worksheet access (by name) rather than iteration
  3. Batch cell operations when possible
  4. Minimize formula recalculations

Migration Considerations

Aspose vs. Spire API Differences

Expected API Mapping:

Spire.XLS Aspose.Cells (Expected)
Workbook Workbook
Worksheet Worksheet
CellRange Cell / Range
Range["A1"] Cells["A1"] or Range["A1"]
Value property Value property
NumberValue property DoubleValue or similar
Formula property Formula property
HasFormula property IsFormula property
CalculateAllValue() CalculateFormula()
LoadFromFile(path) new Workbook(path)
SaveToFile(path, version) Save(path, format)
NumberFormat Style.Custom or NumberFormat

Note: Actual Aspose API may differ; this is based on typical patterns in Excel libraries.

Key Migration Challenges

1. API Surface Area Translation

  • Challenge: 1:1 mapping of all Spire operations to Aspose
  • Mitigation: Create abstraction layer (like ISpireHelper) that can be implemented with Aspose
  • Effort: Medium (can reuse existing abstraction pattern from SCRIBE)

2. Excel Date Handling

  • Challenge: Excel 1900 leap year bug handling
  • Mitigation: Port exact date conversion logic from SpireHelper.SetCellValue()
  • Effort: Low (logic is well-documented in SCRIBE)

3. Formula Preservation

  • Challenge: Ensuring formulas remain intact during cell operations
  • Mitigation: Test extensively with AFR templates (heavy formula usage)
  • Effort: Medium (requires thorough testing)

4. Number Format Strings

  • Challenge: Excel number format syntax must be preserved
  • Mitigation: Aspose likely supports same format strings as Spire
  • Effort: Low (format strings are Excel-standard)

5. Macro-Enabled Workbooks

  • Challenge: Preserving VBA macros in .xlsm files (school reports 26+ schools)
  • Mitigation: Aspose should support macro preservation; verify in early testing
  • Effort: Medium (need to test macro preservation)

6. Multi-Version Excel Format Support

  • Challenge: AFR requires .xls (97-2003), school reports use .xlsx/.xlsm
  • Mitigation: Aspose likely supports all versions; verify in early testing
  • Effort: Low (Aspose is mature library)

7. Business Logic Complexity

  • Challenge: AFR Templater contains years of accumulated business logic
  • Mitigation: Migration can be done gradually, year-by-year
  • Effort: High (requires deep understanding of each year’s requirements)

8. Testing and Validation

  • Challenge: Ensuring output parity with C# tools
  • Mitigation: Comprehensive snapshot testing with known-good outputs
  • Effort: High (critical for financial reporting accuracy)

Create Aspose Adapter:

interface IExcelLibrary {
    // File operations
    loadWorkbook(path: string): Workbook
    saveWorkbook(workbook: Workbook, path: string, format: ExcelFormat): void

    // Cell operations
    getCellValue(workbook: Workbook, worksheet: string, cell: string): CellValue
    setCellValue(
        workbook: Workbook,
        worksheet: string,
        cell: string,
        value: CellValue
    ): void

    // Formula operations
    calculateFormulas(workbook: Workbook): void

    // Type detection
    getCellType(workbook: Workbook, worksheet: string, cell: string): CellType

    // Formatting
    setNumberFormat(
        workbook: Workbook,
        worksheet: string,
        cell: string,
        format: string
    ): void
}

Benefits:

  1. Isolates Aspose-specific code
  2. Enables unit testing with mocks
  3. Allows future library swaps if needed
  4. Mirrors existing ISpireHelper pattern from SCRIBE

Incremental Migration Strategy

Recommended Approach: Bottom-Up Migration

  1. Start with SCRIBE (weeks 1-3)

    • Simpler tool, fewer business logic dependencies
    • Establishes Aspose abstraction layer
    • Validates Excel operation patterns
    • Tests instruction parsing/execution framework
  2. Migrate Core AFR Utilities (weeks 4-5)

    • CodeConverter and year-specific converters (pure logic, no Excel)
    • NormalizedRecordAggregator (pure logic, no Excel)
    • Config parsing (JSON, no Excel)
  3. Migrate Most Recent AFR Year (weeks 6-9)

    • FY2025 or FY2026 (current year)
    • Full template population logic
    • Extensive testing against known outputs
    • Establishes pattern for other years
  4. Migrate Additional Years (weeks 10-18)

    • FY2024, FY2023, FY2022 (as needed)
    • Each year ~1-2 weeks depending on customization
    • Consider: Only migrate years actively used
  5. Advanced Features (weeks 19-22)

    • School-level reporting
    • Desegregation reports
    • Food service integration
    • IDC reports
    • Macro-enabled workbooks
  6. Validation and Cleanup (weeks 23-24)

    • Side-by-side comparison with C# tools
    • Performance testing
    • Documentation
    • Deprecate C# tools

Total Timeline: ~24 weeks at 5-10 hours/week = 120-240 hours

Can be broken into: 12-24 work sessions of 5-10 hours each


Testing Requirements

Test Data Requirements

For SCRIBE:

  • Sample .scribe instruction files
  • Small Excel templates (5-10 worksheets)
  • Expected output files
  • Edge cases: empty cells, formula cells, date cells, large numbers

For AFR Templater:

  • Representative district configs (various sizes, types)
  • Comprehensive normalized_records.csv (all fund/function/object combinations)
  • AFR templates for each fiscal year
  • Known-good output AFRs from production use

Validation Approach

Option 1: Snapshot Testing

  • Serialize generated Excel files to JSON
  • Compare JSON snapshots to known-good outputs
  • Tools: ExcelSerializer pattern from existing tests

Option 2: Cell-by-Cell Comparison

  • Load both C# output and Node.js output
  • Compare cell values, formulas, formats
  • Report differences with cell addresses

Option 3: Checksums

  • Generate checksums of specific cells (anchor cells)
  • Compare checksums between C# and Node.js outputs
  • Fast, but less detailed than full comparison

Recommended: Combination of snapshot testing (comprehensive) and checksum validation (fast)

Critical Test Cases

SCRIBE:

  1. Load and save workbook (no modifications)
  2. Copy/paste single cells (all types)
  3. Copy/paste ranges
  4. Write values (all 8 types)
  5. Arithmetic operations (ADD_CELL, ADD_RAW, MULTIPLY_RANGE)
  6. Formula calculation
  7. Date conversion accuracy
  8. Empty/blank cell handling
  9. Error handling (invalid paths, missing worksheets, etc.)

AFR Templater:

  1. Per-Year Test Suite:

    • Generate AFR for single-school district
    • Generate AFR for multi-school district (2-25 schools)
    • Generate desegregation report (if applicable)
    • Generate food service report (pre-FY2025)
    • Generate school-level report (2-25 schools)
    • Generate school-level report (26-45 schools, macro-enabled)
    • Generate IDC report
  2. Cross-Year Tests:

    • Same district data processed with different year templates
    • Validate year-specific customizations
  3. Edge Cases:

    • Districts with no desegregation funds
    • Districts with no food service
    • Type 3 CTDS (JTED)
    • Private schools in roster
    • Single-school districts
    • Districts with exactly 25, 26, 45 schools (boundary conditions)
  4. Data Validation:

    • Negative amounts (credits, reversals)
    • Zero amounts
    • Very large amounts (>$1 billion)
    • Missing required fund codes
    • Invalid unit codes
    • Unbalanced balance sheet

Regression Testing

After Each Migration Phase:

  1. Run full test suite for migrated components
  2. Compare outputs to C# tool outputs
  3. Validate that no regressions in unmigrated components
  4. Update test snapshots if expected changes

Continuous Validation:

  • Run tests on every commit
  • Automated comparison between C# and Node.js outputs
  • Performance benchmarking (ensure Node.js not significantly slower)

Appendix: Code Locations

SCRIBE

Component Path
Main Entry src/Scribe/Program.cs
Project File src/Scribe/Scribe.csproj
Instruction Parser src/Scribe/Instructions/InstructionParser.cs
Instruction Invoker src/Scribe/Instructions/InstructionInvoker.cs
Base Instruction src/Scribe/Instructions/BaseInstruction.cs
Load Instruction src/Scribe/Instructions/Load/LoadInstruction.cs
Save Instruction src/Scribe/Instructions/Save/SaveInstruction.cs
Write Instruction src/Scribe/Instructions/Write/WriteInstruction.cs
Copy Instruction src/Scribe/Instructions/Copy/CopyInstruction.cs
Paste Instruction src/Scribe/Instructions/Paste/PasteInstruction.cs
Copy Range Instruction src/Scribe/Instructions/CopyRange/CopyRangeInstruction.cs
Paste Range Instruction src/Scribe/Instructions/PasteRange/PasteRangeInstruction.cs
Calculate Instruction src/Scribe/Instructions/Calculate/CalculateInstruction.cs
Add Cell Instruction src/Scribe/Instructions/AddCell/AddCellInstruction.cs
Add Raw Instruction src/Scribe/Instructions/AddRaw/AddRawInstruction.cs
Multiply Range Instruction src/Scribe/Instructions/MultiplyRange/MultiplyRangeInstruction.cs
Dump Instruction src/Scribe/Instructions/Dump/DumpInstruction.cs
Spire Helper src/Scribe/Utilities/SpireHelper.cs
Scribe Store src/Scribe/Utilities/ScribeStore.cs
File Handling src/Scribe/Utilities/FileHandling.cs
Exceptions src/Scribe/Utilities/Exceptions.cs
Timer src/Scribe/Utilities/Timer.cs
Tests tests/Scribe.Tests/
Dockerfile scribe.Dockerfile

AFR Templater

Component Path
Main Entry src/afr-templater/Program.cs
Project File src/afr-templater/afr-templater.csproj
Processor Factory src/afr-templater/processors/ProcessorFactory.cs
Base Processor src/afr-templater/processors/Processor.cs
FY2022 Processor src/afr-templater/processors/FiscalYear2022Processor.cs
FY2023 Processor src/afr-templater/processors/FiscalYear2023Processor.cs
FY2024 Processor src/afr-templater/processors/FiscalYear2024Processor.cs
FY2025 Processor src/afr-templater/processors/FiscalYear2025Processor.cs
FY2026 Processor src/afr-templater/processors/FiscalYear2026Processor.cs
Base Code Converter src/afr-templater/processors/converters/CodeConverter.cs
FY2022 Code Converter src/afr-templater/processors/converters/FiscalYear2022CodeConverter.cs
FY2023 Code Converter src/afr-templater/processors/converters/FiscalYear2023CodeConverter.cs
FY2024 Code Converter src/afr-templater/processors/converters/FiscalYear2024CodeConverter.cs
FY2025 Code Converter src/afr-templater/processors/converters/FiscalYear2025CodeConverter.cs
FY2026 Code Converter src/afr-templater/processors/converters/FiscalYear2026CodeConverter.cs
Code Converter Factory src/afr-templater/processors/converters/CodeConvertorFactory.cs
Normalized Record Aggregator src/afr-templater/utilities/NormalizedRecordAggregator.cs
Config Models src/afr-templater/models/
Tests tests/afr-templater.Tests/
Dockerfile Dockerfile
Documentation docs/AFR/

Shared

Component Path
Solution File nextgen_cs.sln
Spire.XLS Library lib/Spire.XLS.dll
Test Helpers tests/TestHelpers/
CI Dockerfile ci.Dockerfile, ci.scribe.Dockerfile
Tilt Config Tiltfile
GitHub Actions .github/workflows/pr.yml

Conclusion

This document provides comprehensive context for planning the migration of C# Excel manipulation tools to Node.js with Aspose. The migration can be approached incrementally in 5-10 hour work chunks, starting with SCRIBE (simpler, foundational) and progressing to AFR Templater (complex, business-critical).

Key Takeaways:

  1. Two Distinct Tools:

    • SCRIBE: Generic Excel automation (lower complexity)
    • AFR Templater: Specialized financial reporting (high complexity)
  2. Core Dependencies:

    • Spire.XLS API (must map to Aspose equivalents)
    • Excel format support (.xls, .xlsx, .xlsm)
    • Formula handling and date conversion
  3. Migration Complexity:

    • SCRIBE: ~3-4 weeks (15-20 hours)
    • AFR Templater: ~20-22 weeks (100-120 hours)
    • Total: ~23-26 weeks (115-140 hours) at 5-10 hours/week
  4. Critical Success Factors:

    • Robust Aspose abstraction layer
    • Comprehensive snapshot testing
    • Year-by-year AFR migration approach
    • Side-by-side validation with C# outputs
  5. Business Logic Preservation:

    • NormalizedRecordAggregator mask system (complex but well-defined)
    • CodeConverter year-specific rules (well-documented)
    • FillAccountingData method (most complex, needs careful migration)

Next Steps for Planning Agent:

  1. Prioritize migration order (recommend SCRIBE first)
  2. Break down each phase into 5-10 hour work chunks
  3. Define acceptance criteria for each chunk
  4. Identify dependencies between chunks
  5. Propose testing strategy for each chunk
  6. Consider: Which fiscal years need migration? (may not need all 5 years)

Documentation Sources:

  • /docs/AFR/ - Year-by-year change documentation
  • Test files - Concrete examples of inputs and expected outputs
  • Existing C# code - Well-structured with clear separation of concerns

This migration is feasible and can be approached systematically. The existing C# codebase is well-architected, which will facilitate translation to Node.js/Aspose.