Supersonic Photo Uploads to the Cloud Using a Custom NSInputStream



    Uploading photos and videos from the device to the server as fast as possible was our top priority when developing the Cloud Mail.Ru mobile application for iOS. In addition, from the very first version of the application we let users enable automatic upload of the entire contents of the system gallery to the server. This is very convenient for those who worry about losing their phone, but it increases the amount of data transferred many times over.

    So we set ourselves the task of making photo and video uploads from the Mail.Ru Cloud mobile application not just good, but close to ideal. The result is our POSInputStreamLibrary library, which implements streaming uploads of photos and videos from the iOS system gallery to the network. Thanks to its tight integration with the ALAssetLibrary and CFNetwork frameworks, uploading in the application is very fast and does not require a single byte of free space on the device. In this post I will talk about implementing my own subclass of the NSInputStream class from the iOS Developer Library.

    During its service in Mail.Ru Cloud, the POSBlobInputStream stream has acquired a rich set of features:

    • initialization of the stream with an ALAsset URL
    • support for synchronous and asynchronous operation modes
    • automatic reinitialization after the ALAsset object is invalidated
    • caching of data read from ALAsset
    • the ability to specify the offset from which reading begins
    • the ability to integrate with any data source

    Each of these features is explained in its own section. Before looking at them, it only remains to say that the library's source code is available here, as well as in the main CocoaPods repository.

    Initializing the Stream with an ALAsset URL


    As long as the application only had to upload photos, everything was simple. The image from the gallery was saved to a temporary file, and a standard file stream was created from that file. The stream was then passed to NSURLRequest for streaming to the network.

    @interface NSInputStream (NSInputStreamExtensions)
    // ...
    + (id)inputStreamWithFileAtPath:(NSString *)path;
    // ...
    @end
    

    @interface NSMutableURLRequest (NSMutableHTTPURLRequest)
    // ...
    - (void)setHTTPBodyStream:(NSInputStream *)inputStream;
    // ...
    @end
    

    The requirement to support video uploads made this approach unusable. The sheer size of video files caused the following problems:

    • uploading required a large amount of free space on the device
    • saving a video to a temporary file could take 10 minutes or more

    To overcome these problems, the POSBlobInputStream class was developed. It is initialized with the URL of a gallery object and reads the data directly, without creating temporary files.

    @interface NSInputStream (POS)
    + (NSInputStream *)pos_inputStreamWithAssetURL:(NSURL *)assetURL;
    + (NSInputStream *)pos_inputStreamWithAssetURL:(NSURL *)assetURL asynchronous:(BOOL)asynchronous;
    + (NSInputStream *)pos_inputStreamForCFNetworkWithAssetURL:(NSURL *)assetURL;
    @end
    

    At first, I had the feeling that implementing POSBlobInputStream would take very little time, since the interface of its base class is trivial.

    @interface NSInputStream : NSStream
    - (NSInteger)read:(uint8_t *)buffer maxLength:(NSUInteger)len;
    - (BOOL)getBuffer:(uint8_t **)buffer length:(NSUInteger *)len;
    - (BOOL)hasBytesAvailable;
    @end
    

    Moreover, according to the documentation, getBuffer:length: does not have to be supported, so it seemed that only two methods needed to be implemented. Mapping them onto the ALAssetRepresentation interface raised no questions either.

    @interface ALAssetRepresentation : NSObject
    // ...
    - (long long)size;
    - (NSUInteger)getBytes:(uint8_t *)buffer fromOffset:(long long)offset length:(NSUInteger)length error:(NSError **)error;
    // ...
    @end
    

    However, when I launched the freshly built POSBlobInputStream, I was unpleasantly surprised. A call to any method of the NSStream base class ended with an exception like this:
    *** -propertyForKey: only defined for abstract class.  Define -[POSBlobInputStream propertyForKey:]
    

    The reason is that NSInputStream is an abstract class, and each of its init methods creates an object of one of its derived classes. In Objective-C this pattern is called a class cluster. Thus, implementing your own stream requires implementing all the methods of NSStream, and there are quite a lot of them.

    @interface NSStream : NSObject
    - (void)open;
    - (void)close;
    - (id <NSStreamDelegate>)delegate;
    - (void)setDelegate:(id <NSStreamDelegate>)delegate;
    - (id)propertyForKey:(NSString *)key;
    - (BOOL)setProperty:(id)property forKey:(NSString *)key;
    - (void)scheduleInRunLoop:(NSRunLoop *)aRunLoop forMode:(NSString *)mode;
    - (void)removeFromRunLoop:(NSRunLoop *)aRunLoop forMode:(NSString *)mode;
    - (NSStreamStatus)streamStatus;
    - (NSError *)streamError;
    @end
    

    Synchronous and asynchronous operating modes of POSBlobInputStream


    During the development of POSBlobInputStream, the hardest part was implementing the mechanism for asynchronous notification of state changes. In NSStream, the methods scheduleInRunLoop:forMode:, removeFromRunLoop:forMode:, and setDelegate: are responsible for it. Thanks to them, you can create streams that do not have a single byte of data at the moment they are opened. POSBlobInputStream uses this capability for the following purposes:

    • Implementing a non-blocking version of the open method. POSBlobInputStream is considered open as soon as it manages to obtain an ALAssetRepresentation object for its NSURL. As you know, with the iOS SDK this can only be done asynchronously. So a mechanism for asynchronously notifying about a stream status change from NSStreamStatusNotOpen to NSStreamStatusOpen or NSStreamStatusError comes in very handy here.
    • Notifying that the stream has data available for reading by sending the NSStreamEventHasBytesAvailable event.

    For illustration, below are implementations of file checksum calculation using POSBlobInputStream. Let's start with the synchronous version.

    NSInputStream *stream = [NSInputStream pos_inputStreamWithAssetURL:assetURL asynchronous:NO];
    [stream open];
    if ([stream streamStatus] == NSStreamStatusError) {
        /* Report the error */
        return;
    }
    NSParameterAssert([stream streamStatus] == NSStreamStatusOpen);
    while ([stream hasBytesAvailable]) {
        uint8_t buffer[kBufferSize];
        const NSInteger readCount = [stream read:buffer maxLength:kBufferSize];
        if (readCount < 0) {
            /* Report the error */
            return;
        } else if (readCount > 0) {
            /* Checksum calculation logic */
        }
    }
    if ([stream streamStatus] != NSStreamStatusAtEnd) {
        /* Report the error */
        return;
    }
    [stream close];
    

    For all its simplicity, this code has one non-obvious property: if you run it on the main thread, a deadlock occurs. The open method blocks the calling thread until the iOS SDK delivers the ALAsset on the main thread. If open itself is called on the main thread, the result is a classic deadlock. Why a synchronous implementation of the stream was needed at all is described below in the section “Features of integration with NSURLRequest”.
    The asynchronous version of the checksum calculation looks a bit more complicated.

    @interface ChecksumCalculator () <NSStreamDelegate>
    @end
    @implementation ChecksumCalculator
    - (void)calculateChecksumForStream:(NSInputStream *)aStream {
        aStream.delegate = self;
        [aStream open];
        dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
                NSRunLoop *runLoop = [NSRunLoop currentRunLoop];
                [aStream scheduleInRunLoop:runLoop forMode:NSDefaultRunLoopMode];
                for (;;) { @autoreleasepool {
                    if (![runLoop runMode:NSDefaultRunLoopMode
                                     beforeDate:[NSDate dateWithTimeIntervalSinceNow:kRunLoopInterval]]) {
                        break;
                    }
                    const NSStreamStatus streamStatus = [aStream streamStatus];
                    if (streamStatus == NSStreamStatusError || streamStatus == NSStreamStatusClosed) {
                        break;
                    }
                }}
        });
    }
    #pragma mark - NSStreamDelegate
    - (void)stream:(NSStream *)aStream handleEvent:(NSStreamEvent)eventCode {
        switch (eventCode) {
            case NSStreamEventHasBytesAvailable: {
                [self updateChecksumForStream:aStream];
            } break;
            case NSStreamEventEndEncountered: {
                [self notifyChecksumCalculationCompleted];
                [aStream close];
            } break;
            case NSStreamEventErrorOccurred: {
                [self notifyErrorOccurred:[aStream streamError]];
                [aStream close];
            } break;
        }
    }
    @end
    

    ChecksumCalculator sets itself as the event handler of POSBlobInputStream. As soon as new data appears in the stream, or the stream ends, or an error occurs, the stream sends the corresponding events. Note that you can choose which thread they are delivered on. For example, in the listing above they arrive on a worker thread created by GCD.

    Features of integration with ALAssetLibrary


    When working with ALAssetLibrary, the following should be considered:

    • Calls to ALAssetRepresentation methods are very expensive. POSBlobInputStream tries to minimize their number by caching the results. For example, there is a minimum block of data that is fetched when read:maxLength: is called, and a new call is made only after that block is exhausted.
    • ALAssetRepresentation may become invalid. On iOS 5.x, for example, this happens when a photo is saved to the phone’s gallery. From the client code's point of view, this looks like the getBytes:fromOffset:length:error: method of the ALAssetRepresentation object returning zero, while it is known that the data has not been read completely. In this case POSBlobInputStream obtains the ALAssetRepresentation again. It is worth noting that in synchronous mode the calling thread is blocked during reinitialization, while in asynchronous mode it is not.
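
    The block caching from the first point can be sketched roughly as follows. This is an illustration in plain C (which also compiles inside an Objective-C file), not the library's actual code: BlobSource, CachedReader and the deliberately tiny kCacheBlockSize are hypothetical names, and BlobSourceGetBytes only stands in for the expensive getBytes:fromOffset:length:error: call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical block size; a real stream would use a much larger value. */
enum { kCacheBlockSize = 8 };

/* Hypothetical stand-in for ALAssetRepresentation: an in-memory blob
   that counts how many "expensive" fetches were made. */
typedef struct {
    const uint8_t *bytes;
    size_t size;
    unsigned fetchCount;
} BlobSource;

/* Plays the role of getBytes:fromOffset:length:error:. */
static size_t BlobSourceGetBytes(BlobSource *source, uint8_t *buffer,
                                 size_t offset, size_t length) {
    source->fetchCount++;
    if (offset >= source->size) {
        return 0;
    }
    size_t available = source->size - offset;
    size_t count = length < available ? length : available;
    memcpy(buffer, source->bytes + offset, count);
    return count;
}

typedef struct {
    BlobSource *source;
    size_t sourceOffset;             /* offset of the next block to fetch */
    uint8_t block[kCacheBlockSize];  /* cached block */
    size_t blockLength;              /* valid bytes in the cached block */
    size_t blockPosition;            /* bytes already handed to the caller */
} CachedReader;

/* Serves reads from the cached block and goes back to the source
   only when the block is exhausted. Returns the number of bytes copied. */
static size_t CachedReaderRead(CachedReader *reader, uint8_t *buffer,
                               size_t maxLength) {
    assert(reader != NULL && buffer != NULL);
    size_t total = 0;
    while (total < maxLength) {
        if (reader->blockPosition == reader->blockLength) {
            reader->blockLength = BlobSourceGetBytes(reader->source, reader->block,
                                                     reader->sourceOffset,
                                                     kCacheBlockSize);
            reader->blockPosition = 0;
            reader->sourceOffset += reader->blockLength;
            if (reader->blockLength == 0) {
                break;  /* the source has no more data */
            }
        }
        size_t remaining = reader->blockLength - reader->blockPosition;
        size_t wanted = maxLength - total;
        size_t count = wanted < remaining ? wanted : remaining;
        memcpy(buffer + total, reader->block + reader->blockPosition, count);
        reader->blockPosition += count;
        total += count;
    }
    return total;
}
```

    With an 8-byte block, sixteen one-byte reads cost two calls to the source instead of sixteen.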


    Features of integration with NSURLRequest


    The implementation of the iOS SDK networking layer in general, and of NSURLRequest in particular, is based on the CFNetwork framework. Over its long life it has accumulated quite a few skeletons in its closets. But first things first.

    NSInputStream is one of the "toll-free bridged" classes of the iOS SDK. You can cast it to CFReadStreamRef and then work with it as an object of that type. The implementation of NSURLRequest relies on this property: it passes POSBlobInputStream off as its CFReadStream twin, and CFNetwork then talks to it through the C interface. In theory, all C calls to CFReadStream should be proxied to the corresponding NSInputStream methods. In practice, however, there are two serious deviations:

    1. Not all calls are proxied. For some of them you have to do the proxying yourself. I will not dwell on this, since there are good articles on the topic on the Internet: How to implement a CoreFoundation toll-free bridged NSInputStream, Subclassing NSInputStream.
    2. Proxying CFReadStreamGetError crashes the application. This exclusive knowledge was gained by analyzing the application's crash logs and meditating on the CFStream sources. Apparently this is why the function is marked as deprecated in the documentation, yet its use has still not been eradicated from every corner of CFNetwork. So every time an NSInputStream informs CFNetwork about an error, the framework tries to get its description using this ill-fated function. The result is sad.

    There are not many ways to deal with the second problem. Since refactoring CFNetwork is not an option, all that remains is not to provoke it. To prevent CFNetwork from trying to get the error description, never notify it that an error has occurred. For this reason POSBlobInputStream got the shouldNotifyCoreFoundationAboutStatusChange property. If the flag is set, then:

    1. the stream will not send notifications about changes in its status through C callbacks
    2. the streamStatus method will never return the NSStreamStatusError value

    The only way to learn about an error when the flag is raised is to implement the NSStreamDelegate protocol in some class and set it as the stream's delegate (see the checksum calculation example above).

    Another unpleasant discovery was that CFNetwork works with the stream in synchronous mode. Although the framework subscribes to notifications, for some reason it still polls the stream. For example, the open method is called several times in a loop, and if the stream does not manage to reach the open state within that interval, it is considered broken. This quirk of the network framework is the reason POSBlobInputStream supports synchronous operation, albeit with limitations.

    Read Offset Support


    The Cloud Mail.Ru iOS application can resume file uploads. This saves traffic and the user's time when part of the file being uploaded is already in storage. To implement this requirement, POSBlobInputStream was taught to read the contents of a photo not from the beginning but from a given position. The offset is set with the NSStreamFileCurrentOffsetKey property. Since the same key is used to shift the start of a standard file stream, the offset can be specified in a uniform way for both kinds of stream.
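
    Reading from an offset can be sketched in a few lines. The sketch is in plain C and uses an in-memory blob instead of a gallery asset. OffsetReader and its functions are hypothetical names, not part of the library, where the offset is passed through the NSStreamFileCurrentOffsetKey property instead. Clamping the offset to the blob size is also an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reader over an in-memory blob standing in for a gallery asset. */
typedef struct {
    const uint8_t *bytes;
    size_t size;
    size_t position;  /* next byte to hand out */
} OffsetReader;

/* Plays the role of setting the start offset before the first read.
   Assumption: an offset past the end is clamped, so a file that is already
   fully uploaded simply yields an empty stream. */
static void OffsetReaderSetStartOffset(OffsetReader *reader, size_t offset) {
    assert(reader != NULL);
    reader->position = offset < reader->size ? offset : reader->size;
}

/* Copies up to maxLength bytes starting at the current position
   and returns the number of bytes copied (0 at the end of the blob). */
static size_t OffsetReaderRead(OffsetReader *reader, uint8_t *buffer,
                               size_t maxLength) {
    assert(reader != NULL && buffer != NULL);
    size_t remaining = reader->size - reader->position;
    size_t count = maxLength < remaining ? maxLength : remaining;
    memcpy(buffer, reader->bytes + reader->position, count);
    reader->position += count;
    return count;
}
```

    If the server already holds the first 6 bytes of a 10-byte file, setting the start offset to 6 makes the reader deliver only the remaining 4 bytes.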

    Support for custom data sources


    POSBlobInputStream was created to upload photos and videos from the gallery. However, it is designed so that other data sources can be used if necessary. To stream from another source, you need to implement the POSBlobInputStreamDataSource protocol.

    @protocol POSBlobInputStreamDataSource <NSObject>
    //
    // Self-explanatory KVO-compliant properties.
    @property (nonatomic, readonly, getter = isOpenCompleted) BOOL openCompleted;
    @property (nonatomic, readonly) BOOL hasBytesAvailable;
    @property (nonatomic, readonly, getter = isAtEnd) BOOL atEnd;
    @property (nonatomic, readonly) NSError *error;
    //
    // This selector will be called before anything else.
    - (void)open;
    //
    // Data Source configuring.
    - (id)propertyForKey:(NSString *)key;
    - (BOOL)setProperty:(id)property forKey:(NSString *)key;
    //
    // Data Source data.
    // The contracts of these selectors are the same as for NSInputStream.
    - (NSInteger)read:(uint8_t *)buffer maxLength:(NSUInteger)maxLength;
    - (BOOL)getBuffer:(uint8_t **)buffer length:(NSUInteger *)bufferLength;
    @end
    

    The properties are used not only to query the state of the data source, but also to inform the stream about its changes through the KVO mechanism.

    Summary


    While working on the stream, I spent a lot of time searching the web for analogues. First, I did not want to reinvent the wheel, and second, things go much faster when you have a reference model in mind. Unfortunately, I could not find any good implementations. The weak point of most analogues is asynchronous operation. In the best case, as in HSCountingInputStream, an internal instance of one of the standard streams is used to dispatch events, which is incorrect. Often asynchronous operation is not supported at all, as, for example, in NTVStreamMux:

    #pragma mark Undocumented but necessary NSStream Overrides (fuck you Apple)
    - (void) _scheduleInCFRunLoop:(NSRunLoop*) inRunLoop forMode:(id)inMode {
        /* FUCK YOU APPLE */
    }
    - (void) _setCFClientFlags:(CFOptionFlags)inFlags
                      callback:(CFReadStreamClientCallBack)inCallback
                       context:(CFStreamClientContext)inContext {
        /* NO SERIOUSLY, FUCK YOU */
    }
    

    POSBlobInputStream, in turn, is one of the key components of the Mail.Ru Cloud application. During its service it has been battle-tested by an army of users. Plenty of rakes were stepped on and cleared away, and at the moment the stream is one of the most stable components. Use it, write extensions, and, of course, I will be glad to get any feedback.

    Pavel Osipov,
    Head of the Cloud Development Team for iOS

