The Download Manager project is a networking application that helps users download any file from the Internet at high speed by fetching it from multiple servers at once.

Download Manager and Download Workers

As the figure suggests, a CoreDownloadManager object is instantiated for each file download and is fed a list of mirror URLs (which can be just one). For each URL, it creates a CoreDownloadWorker, which acts as a proxy, abstracting the protocol-specific implementation of IProtocolWrapper.
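
As a rough sketch of that relationship (the class names come from the article, but the constructors and fields shown here are assumptions made for illustration):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: DoIt's real constructors and fields are not
// shown in the article. One manager is created per file download, and it
// creates one worker per mirror URL.
class CoreDownloadWorker {
    private final String url;
    CoreDownloadWorker(String url) { this.url = url; }
}

class CoreDownloadManager {
    private final List<CoreDownloadWorker> workers = new ArrayList<>();

    CoreDownloadManager(List<String> mirrorUrls) {
        for (String url : mirrorUrls) {
            workers.add(new CoreDownloadWorker(url)); // one worker per mirror
        }
    }
}
```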

For example, there can be two mirror URLs, one using HTTP and the other FTP, both referring to copies of the same file. The ProtocolWrapperFactory holds references to the protocol-specific implementations and dispenses them when requested through a static method. This is a typical use of the Factory pattern, where specific subclasses (here, protocol-specific implementations of IProtocolWrapper) are returned based on the requested type.
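
A minimal sketch of such a factory follows. The IProtocolWrapper, ProtocolWrapperFactory, and HTTP_ProtocolWrapper names appear in the article; the method signatures, the DownloadInfo holder, and the FTP wrapper name are assumptions:

```java
import java.util.Map;

// Holder for what a wrapper learns about the file (assumed shape).
class DownloadInfo {
    long fileSize;
    boolean resumeSupported;
}

interface IProtocolWrapper {
    // Reads protocol-specific headers to fill in size and resume support.
    DownloadInfo getDownloadInfo(String url) throws Exception;
}

class HTTP_ProtocolWrapper implements IProtocolWrapper {
    public DownloadInfo getDownloadInfo(String url) {
        return new DownloadInfo(); // would issue an HTTP HEAD request
    }
}

class FTP_ProtocolWrapper implements IProtocolWrapper {
    public DownloadInfo getDownloadInfo(String url) {
        return new DownloadInfo(); // would use FTP SIZE/REST commands
    }
}

class ProtocolWrapperFactory {
    private static final Map<String, IProtocolWrapper> WRAPPERS = Map.of(
            "http", new HTTP_ProtocolWrapper(),
            "ftp", new FTP_ProtocolWrapper());

    // Static dispenser: returns the implementation matching the URL scheme.
    public static IProtocolWrapper getInstance(String scheme) {
        return WRAPPERS.get(scheme.toLowerCase());
    }
}
```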

The CoreDownloadWorker spawns a separate thread and invokes the getDownloadInfo() method on the protocol implementation. The protocol-specific implementation reads the relevant headers to determine the file size and support for resuming, and puts this information into the CoreDownloadWorker. The protocol implementations are kept separate from the core package because that makes supporting multiple protocols easier, more like developing plug-ins. The HTTP support that comes with the default DoIt distribution, namely http_protocolwrapper.HTTP_ProtocolWrapper, is actually a separate plug-in. HTTP_ProtocolWrapper in turn relies on an open source library called HTTPClient, developed by Ronald Tschalär, to do the actual communication over HTTP.
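
Reusing the types from the previous sketch, the info-gathering step might look something like this; the exact threading and field names in DoIt may differ:

```java
// Sketch of the info-gathering step: each worker spawns its own thread
// and delegates to the protocol wrapper obtained from the factory.
class CoreDownloadWorker {
    private final String url;
    private volatile DownloadInfo info; // filled in by the worker thread

    CoreDownloadWorker(String url) { this.url = url; }

    void gatherInfo() {
        String scheme = url.substring(0, url.indexOf(':'));
        IProtocolWrapper wrapper = ProtocolWrapperFactory.getInstance(scheme);
        new Thread(() -> {
            try {
                info = wrapper.getDownloadInfo(url); // size + resume support
            } catch (Exception e) {
                e.printStackTrace(); // real code would report this back
            }
        }).start();
    }
}
```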

The HTTP Protocol

An HTTP HEAD request is used to find out whether the servers support resuming. A HEAD request has the same structure as a GET request, except that the server responds with just the header information, such as support for resuming, the file size, and so forth. If the servers support resuming, the file size is read from those same headers. This saves the user from downloading the entire file just to get the file information.
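
For illustration, here is a HEAD probe written against the JDK's HttpURLConnection (DoIt itself goes through the HTTPClient library, but the idea is the same; the URL is hypothetical):

```java
import java.net.HttpURLConnection;
import java.net.URL;

// A HEAD request: the server returns only the headers, never the body,
// so the file size comes back without transferring the file itself.
public class HeadProbe {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://example.com/file.zip"); // hypothetical URL
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("HEAD"); // same structure as GET, headers only

        long fileSize = conn.getContentLengthLong(); // Content-Length header
        System.out.println("Reported file size: " + fileSize + " bytes");
        conn.disconnect();
    }
}
```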

If the server responds with a '206' (Partial Content) status code, resuming is supported. The response also carries a 'Content-Length' field, which tells how big the file is. If, on the other hand, a '200' status code is returned, the server does not support resuming; in such cases the file cannot be downloaded in parts. For more details on the HTTP 1.1 protocol, read RFC 2616. Other configuration properties for HTTP, such as proxy settings, can be configured through the HTTP_ProtocolWrapper.properties file in the 'pw' (short for Protocol Wrapper) directory.
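
The 206-versus-200 check could be sketched like this, again with HttpURLConnection standing in for the HTTPClient library; requesting a one-byte range is one way to see whether the server honors partial content:

```java
import java.net.HttpURLConnection;
import java.net.URL;

// Probe for resume support: a server that honors the Range header
// answers 206 (Partial Content); one that ignores it answers 200.
public class ResumeCheck {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://example.com/file.zip"); // hypothetical URL
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("HEAD");
        conn.setRequestProperty("Range", "bytes=0-0"); // ask for one byte

        int code = conn.getResponseCode();
        if (code == HttpURLConnection.HTTP_PARTIAL) {        // 206: resumable
            System.out.println("Server supports resuming");
        } else if (code == HttpURLConnection.HTTP_OK) {      // 200: no ranges
            System.out.println("Server does not support resuming");
        }
        conn.disconnect();
    }
}
```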

After all the CoreDownloadWorkers have finished gathering information about the file, the CoreDownloadManager consolidates it. If at least one CoreDownloadWorker supports resuming, it is chosen over those that don't. If none of them supports resuming, the first one is chosen. If more than one CoreDownloadWorker supports resuming, all of them are used to balance the load. With this list in hand, the old CoreDownloadWorkers are discarded and new ones are created, one for each URL in the list. Since it is now known whether each URL supports resuming, and the file length is also known, the CoreDownloadWorkers can be configured to download specific fragments of the file. If resuming is not supported, a single CoreDownloadWorker is set to download the entire file.
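
The selection rule condenses to a few lines; Worker here is a stand-in for CoreDownloadWorker, and only the selection logic is shown:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the consolidation rule described above.
public class Consolidate {
    static class Worker {
        final String url;
        final boolean resumable;
        Worker(String url, boolean resumable) {
            this.url = url;
            this.resumable = resumable;
        }
    }

    // Keep every resumable worker if any exist (so the load can be
    // balanced across mirrors); otherwise fall back to the first one.
    static List<Worker> select(List<Worker> workers) {
        List<Worker> resumable = new ArrayList<>();
        for (Worker w : workers) {
            if (w.resumable) resumable.add(w);
        }
        return resumable.isEmpty()
                ? List.of(workers.get(0)) // no resume anywhere: one worker
                : resumable;              // use every resumable mirror
    }
}
```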

Downloading the File Fragments Simultaneously

The HTTP plug-in uses a GET request to download the file. If resuming is supported, each of the CoreDownloadWorkers is configured to download a non-overlapping fragment of the file. This request is conveyed to the server(s) by setting the 'Range' field of the HTTP GET request to the desired byte range. For example, if one CoreDownloadWorker has to download the first 1024 bytes, the 'Range' value of the GET request would be 'bytes=0-1023'. DoIt downloads in chunks whose size is either the CHUNK_SIZE value from the DoIt.properties file in the DoIt directory (minimum of 50 KB) or one seventh of the file size, whichever is bigger.
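
The chunking rule boils down to a small computation; the file size below is hypothetical, and only the Range arithmetic reflects the article:

```java
// Sketch of the chunking rule: the chunk size is the larger of
// CHUNK_SIZE (itself at least 50 KB) and one seventh of the file, and
// each fragment becomes a non-overlapping 'Range' header value.
public class Chunks {
    public static void main(String[] args) {
        long fileSize = 1_000_000;        // hypothetical file size
        long configuredChunk = 50 * 1024; // CHUNK_SIZE from DoIt.properties
        long chunkSize = Math.max(configuredChunk, fileSize / 7);

        for (long start = 0; start < fileSize; start += chunkSize) {
            long end = Math.min(start + chunkSize, fileSize) - 1; // inclusive
            // e.g. a 1024-byte first fragment -> "bytes=0-1023"
            System.out.println("Range: bytes=" + start + "-" + end);
        }
    }
}
```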

The actual download process runs in a separate thread, which is controlled by the CoreDownloadManager. The bytes read from the protocol implementation are saved into files specific to each CoreDownloadWorker. After all the CoreDownloadWorkers have finished downloading, the FileRecombiner in the sd.util sub-package combines them into one file at the location initially specified by the user. The partial files downloaded by each CoreDownloadWorker are stored in a temporary directory inside the 'dwnld' (short for Download) directory, whose name is an MD5 hash of the first mirror URL typed in for the download. The MD5 algorithm implementation for Java used here was
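
Both utilities are straightforward to sketch with the standard library; the method names below are illustrative, not DoIt's actual API:

```java
import java.io.IOException;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.security.MessageDigest;

// Sketch of the two utilities described above: naming the temporary
// directory after an MD5 hash of the first mirror URL, and recombining
// the per-worker part files into the final file.
public class RecombineSketch {
    // Hex MD5 of the first mirror URL, used as the directory name
    // under 'dwnld'.
    static String tempDirName(String firstMirrorUrl) throws Exception {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        byte[] digest =
                md5.digest(firstMirrorUrl.getBytes(StandardCharsets.UTF_8));
        return String.format("%032x", new BigInteger(1, digest));
    }

    // Appends each worker's part file, in order, onto the final file.
    static void recombine(Path out, Path... parts) throws IOException {
        Files.deleteIfExists(out);
        for (Path part : parts) {
            Files.write(out, Files.readAllBytes(part),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }
    }
}
```
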
To support resuming the download, the CoreDownloadManager and CoreDownloadWorkers can be serialized. Since they carry with them the URLs being used and the number of bytes copied so far, a download can be stopped at any time and resumed later simply by serializing and de-serializing the CoreDownloadManager, which in turn serializes and de-serializes the CoreDownloadWorkers. This serialized object graph is stored in a file called DownloadInfo.ser in the temporary directory allocated for the download.
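
A compact sketch of this stop/resume mechanism follows; the fields are illustrative, and the point is that writing the manager writes the whole object graph, workers included:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.List;

// Sketch of stop/resume via Java serialization to DownloadInfo.ser.
class CoreDownloadWorker implements Serializable {
    String url;
    long bytesCopied; // progress survives the restart
}

class CoreDownloadManager implements Serializable {
    List<CoreDownloadWorker> workers; // serialized along with the manager

    void save(File file) throws IOException {
        try (ObjectOutputStream out =
                     new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(this); // workers are written as part of the graph
        }
    }

    static CoreDownloadManager load(File file)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                     new ObjectInputStream(new FileInputStream(file))) {
            return (CoreDownloadManager) in.readObject();
        }
    }
}
```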
