Streamlining File Transfers with VxFlow: A Simple Solution for Secure and Automated File Management

Managing file transfers between servers can get complicated, especially under constraints such as one-way SSH access and the need to guarantee data integrity.

Problem Statement

I recently tackled an interesting challenge involving two servers: Server-A and Server-B. While Server-B could connect to Server-A via SSH, the reverse was not possible. On Server-A, a service was generating files based on messages from an MQ queue, but I had no direct access to modify this service. I needed to move these files reliably to Server-B, ensuring data integrity, avoiding duplicates, and handling potential errors in the process.

Overview

To solve this, I developed two Bash scripts: one for each server. Here’s how the process works:

  1. On Server-A:
    • A script periodically moves files from the service’s output directory to a temporary directory.
    • It generates a checksum for the files and creates a special “wait” file to signal that the process is ongoing.
    • The script runs every minute but pauses if there are files already in the temporary directory, ensuring a consistent state.
  2. On Server-B:
    • A script checks the temporary directory on Server-A via SSH for the presence of the “wait” file.
    • If the “wait” file is present, it exits, as this indicates that the transfer on Server-A is not yet complete.
    • If the “wait” file is absent, it copies all files from Server-A’s temporary directory to its own temporary directory on Server-B.
    • After the files are transferred, the script calculates a checksum and compares it with the checksum file from Server-A.
    • If the checksums match, the files are moved to their final destination on Server-B, and the temporary directory on Server-A is cleaned up.
    • If the checksums do not match, the script triggers a recalculation process on Server-A and exits.
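
The Server-A half of this flow can be sketched roughly as follows. This is a minimal illustration, not the script from the repository: the function name `prepare_batch`, the marker names `.wait` and `.checksums.sha256`, and the choice of `sha256sum` (GNU coreutils) are all assumptions made for the example. It also assumes the "wait" file is removed once the checksum list has been written, which is how the steps above read.

```shell
#!/usr/bin/env bash
# Sketch of the Server-A preparation step.
# All names here are hypothetical; the real script may differ.

prepare_batch() {
    local out_dir=$1 tmp_dir=$2 f
    local files=()

    mkdir -p "$tmp_dir" || return 1

    # Pause while anything from a previous batch is still in the temp dir.
    if [ -n "$(ls -A "$tmp_dir")" ]; then
        return 0
    fi

    # Collect the regular files the service has produced so far.
    for f in "$out_dir"/*; do
        if [ -f "$f" ]; then files+=("$f"); fi
    done
    if [ "${#files[@]}" -eq 0 ]; then
        return 0
    fi

    # Create the "wait" file first so Server-B never picks up a half-built batch.
    touch "$tmp_dir/.wait" || return 1

    mv -- "${files[@]}" "$tmp_dir"/ || return 1

    # Checksum every moved file; the glob skips the hidden marker files.
    ( cd "$tmp_dir" && sha256sum -- * > .checksums.sha256 ) || return 1

    # The batch is complete: remove the marker so Server-B may collect it.
    rm -f "$tmp_dir/.wait"
}

# Example call with hypothetical paths:
#   prepare_batch /srv/service/out /srv/vxflow/tmp
```

Creating the marker before moving any file means a reader on Server-B either sees the marker or sees a finished batch, never a partial one. The sketch does not guard against the service still writing a file at the moment it is moved.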

Diagram

[Figure: VxFlow diagram]

Ensuring Data Integrity with Checksum Verification

To prevent data corruption or incomplete transfers, the solution verifies checksums after every transfer. The process involves:

  1. Calculating a checksum for the files on Server-A after preparation.
  2. Recalculating the checksum on Server-B after transferring the files.
  3. Comparing the two checksums to confirm data integrity.
  4. If the checksums do not match, the script triggers a recalculation on Server-A and exits, so the transfer is retried on the next scheduled run.
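
A rough sketch of how the Server-B side might implement these steps is shown below. It is an illustration under stated assumptions rather than the repository's code: the function names, the marker names `.wait` and `.checksums.sha256`, and the use of `rsync` for the copy are hypothetical choices (the original may well use `scp`), and the commands assume GNU coreutils and findutils on both servers.

```shell
#!/usr/bin/env bash
# Sketch of the Server-B collection and verification step.
# All names here are hypothetical; the real script may differ.

# Verify staged files against the checksum list copied from Server-A,
# then move them to the final directory. Returns non-zero on a mismatch.
verify_and_publish() {
    local stage_dir=$1 final_dir=$2 f

    # sha256sum -c recomputes each digest and compares it with the list.
    ( cd "$stage_dir" && sha256sum -c --quiet .checksums.sha256 ) || return 1

    mkdir -p "$final_dir" || return 1
    for f in "$stage_dir"/*; do
        if [ -f "$f" ]; then
            mv -- "$f" "$final_dir"/ || return 1
        fi
    done
    rm -f "$stage_dir/.checksums.sha256"
}

# Pull one batch over SSH; Server-B initiates every connection.
collect_batch() {
    local remote=$1 remote_tmp=$2 stage_dir=$3 final_dir=$4

    # Exit quietly while Server-A is still preparing the batch.
    if ssh "$remote" "test -e '$remote_tmp/.wait'"; then
        return 0
    fi

    mkdir -p "$stage_dir" || return 1
    rsync -a "$remote:$remote_tmp/" "$stage_dir/" || return 1

    # No checksum list means there was no batch to collect.
    [ -f "$stage_dir/.checksums.sha256" ] || return 0

    if verify_and_publish "$stage_dir" "$final_dir"; then
        # Success: empty the remote temp dir so the next batch can be staged.
        ssh "$remote" "find '$remote_tmp' -mindepth 1 -delete"
    else
        # Mismatch: drop the local copy and ask Server-A to rebuild its list.
        rm -rf "$stage_dir"
        ssh "$remote" "cd '$remote_tmp' && touch .wait \
            && sha256sum -- * > .checksums.sha256 && rm -f .wait"
        return 1
    fi
}

# Example call with hypothetical host and paths:
#   collect_batch user@server-a /srv/vxflow/tmp /srv/vxflow/stage /srv/data/in
```

Files reach the final directory only after `sha256sum -c` succeeds, so a corrupted or partial copy never leaves the staging directory. Because the remote directory is emptied only on success, a failed run leaves the batch in place for the next attempt.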

Automation and Scheduling

The entire process is automated with cron jobs. The script on Server-A runs every minute to prepare files, while the script on Server-B runs periodically to transfer and verify them. Files keep flowing from Server-A to Server-B without manual intervention.
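
As an illustration, the two crontab entries might look like the fragment below. The script paths and log locations are hypothetical, and the five-minute interval for Server-B is an assumption, since only the one-minute schedule on Server-A is stated above.

```crontab
# Server-A crontab: prepare a batch every minute (path is hypothetical)
* * * * * /opt/vxflow/server_a.sh >> "$HOME/vxflow_a.log" 2>&1

# Server-B crontab: collect and verify every five minutes (interval is an assumption)
*/5 * * * * /opt/vxflow/server_b.sh >> "$HOME/vxflow_b.log" 2>&1
```

Redirecting output to a log file keeps a record of skipped runs and checksum failures, which cron would otherwise try to deliver by mail.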

This two-script solution demonstrates how you can automate file transfers and ensure data integrity using lightweight, open tools. By addressing constraints like one-way SSH access and service limitations, I was able to build a robust system that could handle errors gracefully and operate without manual supervision.

Sharing

To make it accessible to others, I’ve uploaded both scripts to GitHub. The repository includes:

  • The Server-A script for file preparation, checksum generation, and signaling readiness.
  • The Server-B script for transferring files, verifying data integrity, and cleaning up temporary directories.
  • A detailed README file with instructions on how to set up and customize the scripts for your own environment.
This post is licensed under CC BY 4.0 by the author.