
We are building a new application for a client to manage their cases. They are already using an existing system in which files associated with the cases are stored in an FTP folder. There is an attachment table (~1M rows) which maps cases to their FTP file locations. As part of the migration the client also wants to move away from FTP to a cloud storage provider (Azure). Currently there is roughly 1 TB of files in the FTP folder which we need to move to Azure.

Current architecture:

[architecture diagram]

In the FTP folder there is no structure; files are just dumped there and the link is stored in the Attachment table. In Azure, however, we would need to create a folder structure. Because of this we cannot simply copy the same files over to Azure as-is.

There are a couple of approaches:

Option 1:

  1. Write a script in Node.js which reads the case table and gets all the associated rows from the attachment table for one case (a rough sketch of this loop is shown after the list).
  2. Get the links from the attachment table.
  3. Make an FTP connection and fetch the actual files using the links from the previous step.
  4. Generate the folder structure on the local system.
  5. Once all the files are retrieved, push them into Azure.
  6. Delete the local folder structure.
  7. Repeat the steps for the next case.
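
A minimal sketch of that per-case loop, assuming the `basic-ftp` and `@azure/storage-blob` packages, a "cases" container, and placeholder helpers (`loadCases`, `loadAttachments`) standing in for the real case/attachment table queries:

```js
// Sketch only: basic-ftp / @azure/storage-blob are one possible package choice,
// and loadCases / loadAttachments stand in for the real case/attachment queries.
const fs = require("fs");
const path = require("path");
const ftp = require("basic-ftp");
const { BlobServiceClient } = require("@azure/storage-blob");

async function migrateCase(caseId, attachments, ftpClient, containerClient) {
  // Step 4: create a temporary local folder for this case.
  const localDir = path.join("staging", String(caseId));
  fs.mkdirSync(localDir, { recursive: true });

  for (const att of attachments) {
    // Step 3: download the file from FTP using the link stored in the attachment row.
    const localFile = path.join(localDir, att.fileName);
    await ftpClient.downloadTo(localFile, att.ftpPath);

    // Step 5: push it to Azure under a per-case "folder" (a blob name prefix).
    const blobName = `cases/${caseId}/${att.fileName}`;
    await containerClient.getBlockBlobClient(blobName).uploadFile(localFile);
  }

  // Step 6: remove the local folder before moving on.
  fs.rmSync(localDir, { recursive: true, force: true });
}

async function main() {
  const ftpClient = new ftp.Client();
  await ftpClient.access({ host: "ftp.example.com", user: "user", password: "pass" });

  const containerClient = BlobServiceClient
    .fromConnectionString(process.env.AZURE_STORAGE_CONNECTION_STRING)
    .getContainerClient("cases");

  // Step 7: repeat for every case.
  for (const c of await loadCases()) {
    await migrateCase(c.id, await loadAttachments(c.id), ftpClient, containerClient);
  }
  ftpClient.close();
}
```

Note that blob "folders" in Azure are just name prefixes (e.g. `cases/42/abc.pdf`), so nothing needs to be created on the Azure side; only the local staging folder is a real directory.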

Option 2:

  1. In this option we run through the same steps as before up to step 5.
  2. But we do not delete the folder structure; instead we build the folder structure for all the cases on the local machine.
  3. Deploy the files all at once into Azure (see the sketch after this list).
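
For step 3, a sketch of the bulk upload, assuming the same `@azure/storage-blob` package and that the whole staging tree has already been built locally by the earlier steps:

```js
// Sketch only: assumes the same @azure/storage-blob package and that the whole
// "staging" tree has already been built locally for all cases.
const fs = require("fs");
const path = require("path");
const { BlobServiceClient } = require("@azure/storage-blob");

// Recursively list every file under the staging root.
function* walk(dir) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) yield* walk(full);
    else yield full;
  }
}

async function uploadAll(stagingRoot, containerClient) {
  for (const file of walk(stagingRoot)) {
    // Mirror the local folder structure in the blob name, e.g. "42/abc.pdf".
    const blobName = path.relative(stagingRoot, file).split(path.sep).join("/");
    await containerClient.getBlockBlobClient(blobName).uploadFile(file);
  }
}

// e.g. uploadAll("staging", BlobServiceClient
//   .fromConnectionString(process.env.AZURE_STORAGE_CONNECTION_STRING)
//   .getContainerClient("cases"));
```

Alternatively, Microsoft's AzCopy CLI can push a whole directory tree (`azcopy copy <local-dir> <container-URL> --recursive`) without any custom upload code.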

It would be really helpful to understand which is the best approach we can take. Are there any other approaches apart from the above?

Also, Option 1 could be run in parallel (multiple cases in one shot). What could be the limitations of this? Option 2 would require at least 1.2 TB of local space, which is a little hard to get given the current logistical limitations in the company.

Sam
  • Uploading all at once would give you the benefit of being able to manually inspect the script's output before you moved it up to Azure. – GrandmasterB Jan 05 '19 at 08:49
  • why do you need a folder structure in Azure? – Ewan Jan 07 '19 at 06:00
  • @Ewan, because that is how other systems are storing it and it is the accepted structure in this project. – Sam Jan 09 '19 at 01:45

1 Answer


It really depends on your requirements and constraints as you've mentioned here.

Option 1 (as the naive approach if you are space-constrained, otherwise Option 2) is preferable in your case, since you shouldn't waste time on one-shot, throw-away migration code for a small file server, unless you foresee needing to do this again multiple times in the future, for bigger file servers, with the same requirements (FTP -> Azure Storage) and scheme.

(Also, in my opinion, getting a 2 TB+ hard drive in the current storage market shouldn't be an issue.)

There are many more approaches to this problem, such as uploading incrementally (and switching the current DB records to point to the Azure service after each upload; a sketch of that record switch is shown below), parallelising it across systems as you mentioned, etc.
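
As an illustration of the incremental variant, the record switch after each successful upload could be as small as this (the `mssql` package and the `FileUrl`/`MigratedAt` columns are assumptions about the schema, not the actual one):

```js
// Sketch only: mssql is one possible DB client, and the FileUrl / MigratedAt
// columns are assumptions about the Attachment table, not the real schema.
const sql = require("mssql");

// Call after each successful upload; assumes sql.connect(...) was done at startup.
async function switchRecordToAzure(attachmentId, blobClient) {
  await sql.query`UPDATE Attachment
                  SET FileUrl = ${blobClient.url}, MigratedAt = GETDATE()
                  WHERE Id = ${attachmentId}`;
}
```

Rows still pointing at FTP then mark exactly what is left to migrate, which doubles as a resume mechanism and avoids ever holding the full 1 TB locally.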

However, I wouldn't sweat it; especially when it's for a specific client with such a specific scheme, I can't see how you would reuse this code in a later project with a different scheme/requirements.

nadir
  • Thanks for the suggestion. But this is a one-time thing, so code reusability is not a concern here. – Sam Jan 16 '19 at 15:25