Updated the load_cluster function to improve parallel processing: the table creation is now committed before all files are dispatched to worker processes. This reduces the serial workload on large datasets and ensures that the workers' schema compatibility checks can see the committed table. The file-streaming logic has been clarified, and progress tracking is maintained throughout the load.

| Name |
|---|
| generic_loader |
| utils |
| .gitignore |
| requirements.txt |
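
The commit-before-dispatch pattern described in the commit message can be sketched as follows. This is a minimal illustration and not the repository's actual code: the use of SQLite and CSV input, the `load_file` helper, the `executor_factory` parameter, and the signature of `load_cluster` are all assumptions made for the example.

```python
import csv
import sqlite3
from concurrent.futures import ProcessPoolExecutor, as_completed


def load_file(db_path, table, path):
    """Hypothetical worker: check one file against the committed table, then insert it."""
    conn = sqlite3.connect(db_path, timeout=30)  # wait if another worker is writing
    try:
        # The table is visible here only because the parent committed it first.
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        with open(path, newline="") as fh:
            reader = csv.reader(fh)
            header = next(reader)
            if header != columns:
                raise ValueError(f"{path}: header {header} does not match {columns}")
            placeholders = ", ".join("?" for _ in columns)
            with conn:  # commits on success, rolls back on error
                cur = conn.executemany(
                    f'INSERT INTO "{table}" VALUES ({placeholders})', reader
                )
                return cur.rowcount
    finally:
        conn.close()


def load_cluster(db_path, table, columns, files, executor_factory=ProcessPoolExecutor):
    """Create and commit the table, then hand every file to a worker."""
    conn = sqlite3.connect(db_path)
    try:
        cols = ", ".join(f'"{c}" TEXT' for c in columns)
        conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({cols})')
        conn.commit()  # commit before dispatch so workers can see the schema
    finally:
        conn.close()

    total = 0
    with executor_factory() as pool:
        futures = {pool.submit(load_file, db_path, table, f): f for f in files}
        # Report progress as each file finishes, in completion order.
        for done, fut in enumerate(as_completed(futures), start=1):
            total += fut.result()  # re-raises any schema mismatch from the worker
            print(f"[{done}/{len(futures)}] loaded {futures[fut]}")
    return total
```

Because no file is loaded serially in the parent, the only serial step left is the table creation itself. When calling this with the default `ProcessPoolExecutor` from a script, place the call under an `if __name__ == "__main__":` guard; passing `executor_factory=ThreadPoolExecutor` is a convenient way to exercise the same flow in a single process.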