foxtrot

Author	SHA1	Message	Date
David Peterson	f84e127796	Update type inference behavior in load_sas.py to scan entire files by default. Changed the default setting for TYPE_INFERENCE_SAMPLE_ROWS to None, allowing type and nullability inference to consider all rows in a SAS file. This adjustment ensures accurate handling of null values and integer ranges, addressing issues observed in production with large datasets. Updated documentation to reflect the implications of this change and the risks associated with using an integer cap for sampling.	2026-04-20 20:43:27 -05:00
michael-corey	2390ce1e0c	adding explorer	2026-04-20 16:27:54 -05:00
David Peterson	b78f6d648f	Enhance file clustering by implementing numeric sorting for last digit groups in stems and updating documentation for embedded-digit handling in auto-detection.	2026-04-20 11:48:22 -05:00
michael-corey	b3d7a9d440	adding index field	2026-04-20 10:18:09 -05:00
michael-corey	0d955eeab1	adding partition flag	2026-04-20 09:56:00 -05:00
michael-corey	e39eb47a90	altering such that commit is by batch	2026-04-20 08:38:38 -05:00
michael-corey	2d95711d9d	Updating python reference	2026-04-18 13:43:29 -05:00
michael-corey	f101eacffd	Merging main	2026-04-18 13:39:37 -05:00
michael-corey	edb9146682	moving files	2026-04-18 13:35:32 -05:00
michael-corey	1bbe0d4cd6	removing latin encoding, adding usage notes	2026-04-18 13:06:01 -05:00
David Peterson	c1e1fec10b	Update requirements.txt to support new package versions and add boto3 dependency	2026-04-18 12:41:02 -05:00
michael-corey	3b913b2ca6	adding user prompt for db creds	2026-04-18 12:37:22 -05:00
David Peterson	5b48872dd7	Add generate_sample_folder.py and load_folder.py for clustered SAS file generation and loading. Introduce generate_sample_folder.py to create a test folder with clustered SAS XPORT files, including configurations for schema compatibility checks. Implement load_folder.py to facilitate loading entire directories of SAS files into Postgres, supporting explicit and auto-detect clustering. Update sample_folder_config.yaml for usage examples and configuration structure. Enhance load_sas.py with a public schema compatibility check function for orchestrators.	2026-04-18 11:25:04 -05:00
michael-corey	6b12ab969b	adding file_viewer	2026-04-18 11:19:38 -05:00
David Peterson	5645ff5597	Update load_sas.py to support streaming data loads with iter_sas_chunks and copy_dataframes. Enhance documentation for schema inference and type detection, clarifying the use of read_sas_preview and the implications of sampling. Add __pycache__ to .gitignore.	2026-04-18 10:44:32 -05:00
David Peterson	3a0537270c	Implement type inference sampling in load_sas.py to improve performance on large SAS files. Introduce TYPE_INFERENCE_SAMPLE_ROWS to limit the number of rows scanned for type detection while ensuring nullability checks cover the entire column. Update documentation to reflect these changes.	2026-04-18 10:28:37 -05:00
David Peterson	4f7ded09c6	Enhance load_sas.py with detailed usage instructions, YAML config structure, and command-line interface documentation for loading SAS files.	2026-04-18 10:20:07 -05:00
michael-corey	f681f1012a	Adding generic loader	2026-04-18 09:34:48 -05:00

18 Commits