Modules in Detail

1. Document Detection

SBSDKDocumentDetector uses digital image processing algorithms to find rectangular, document-like polygons in a digital image.

- (SBSDKDocumentDetectorResult *)detectDocumentPolygonOnImage:(UIImage *)image
                                             visibleImageRect:(CGRect)visibleRect
                                             smoothingEnabled:(BOOL)smooth
                                   useLiveDetectionParameters:(BOOL)liveDetection

The detector accepts either a UIImage object or a CMSampleBufferRef as input. The camera API in Apple's AVFoundation framework typically returns CMSampleBufferRef objects, so you can easily detect document polygons on a live camera video stream or a still image shot. Images from the photo library are usually converted to UIImage objects before being passed to the detector.

The visibleRect parameter lets you limit the detection area on the image. If you pass CGRectZero or a rectangle with zero width or height, the whole image is used for detection. The rectangle must be provided in the unit coordinate system { 0.0f, 0.0f } - { 1.0f, 1.0f }, where { 0.0f, 0.0f } is the top left and { 1.0f, 1.0f } the bottom right corner of the image. The detector ignores edges that are outside the visibleRect.
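If you start from a region in image pixels, you have to convert it to unit coordinates first. The following is a minimal sketch of that conversion; the helper name unitRect(forPixelRect:imageSize:) is hypothetical and not part of the SDK.

```swift
import Foundation

/// Converts a rectangle given in image pixels into the unit coordinate system
/// ({0, 0} = top left, {1, 1} = bottom right).
/// Returns .zero (meaning "use the whole image") if the input cannot be mapped.
/// Note: hypothetical helper for illustration, not an SDK function.
func unitRect(forPixelRect pixelRect: CGRect, imageSize: CGSize) -> CGRect {
    guard imageSize.width > 0, imageSize.height > 0 else { return .zero }
    let unit = CGRect(x: pixelRect.origin.x / imageSize.width,
                      y: pixelRect.origin.y / imageSize.height,
                      width: pixelRect.size.width / imageSize.width,
                      height: pixelRect.size.height / imageSize.height)
    // Clamp to the unit square so the result never leaves the image.
    let clamped = unit.intersection(CGRect(x: 0, y: 0, width: 1, height: 1))
    return clamped.isNull ? .zero : clamped
}

// The center half of a 1000 x 2000 pixel image.
let visibleRect = unitRect(forPixelRect: CGRect(x: 250, y: 500, width: 500, height: 1000),
                           imageSize: CGSize(width: 1000, height: 2000))
```

The resulting rectangle can then be passed as the visibleImageRect argument.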

The smooth parameter is typically used for realtime detection, where the same detector instance is reused for every frame. When set to YES, consecutive detected document polygons within a certain timeframe are accumulated into a single dampened polygon. This prevents jumping polygons in situations where the recognized edges change from video frame to video frame. If you use smoothing, you should also observe the device's motion and clear the consecutive polygon buffer when significant motion is detected. This is done by calling the detector's -resetSmoothingData method.
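To illustrate the idea of dampening, here is a minimal sketch that averages the polygons seen within a sliding time window. It is not the SDK's implementation: the PolygonSmoother type, the plain averaging and the window length are all assumptions made for illustration.

```swift
import Foundation

/// A quadrilateral given by four corner points in unit coordinates.
typealias Quad = [CGPoint]

/// Hypothetical smoother (not an SDK type): averages all quads seen within a time window.
struct PolygonSmoother {
    let window: TimeInterval
    private var samples: [(time: TimeInterval, quad: Quad)] = []

    init(window: TimeInterval) { self.window = window }

    /// Adds a newly detected quad and returns the dampened (averaged) quad.
    mutating func add(_ quad: Quad, at time: TimeInterval) -> Quad {
        precondition(quad.count == 4, "expected four corners")
        samples.append((time: time, quad: quad))
        // Drop samples that are older than the window.
        let cutoff = time - window
        samples.removeAll { $0.time < cutoff }
        let current = samples
        let count = CGFloat(current.count)
        return (0..<4).map { corner in
            let sumX = current.reduce(CGFloat(0)) { $0 + $1.quad[corner].x }
            let sumY = current.reduce(CGFloat(0)) { $0 + $1.quad[corner].y }
            return CGPoint(x: sumX / count, y: sumY / count)
        }
    }

    /// Clears the buffer, e.g. after significant device motion.
    mutating func reset() { samples.removeAll() }
}

var smoother = PolygonSmoother(window: 0.5)
let first: Quad = [CGPoint(x: 0, y: 0), CGPoint(x: 1, y: 0), CGPoint(x: 1, y: 1), CGPoint(x: 0, y: 1)]
let second: Quad = first.map { CGPoint(x: $0.x + 0.1, y: $0.y) }
_ = smoother.add(first, at: 0.0)
let smoothed = smoother.add(second, at: 0.1) // each corner is the mean of both frames
```

Calling reset() in this sketch plays the role that -resetSmoothingData plays for the detector: after strong device motion the old polygons no longer describe the scene and should not influence the next result.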

You typically set the liveDetection flag to YES when you need the fastest possible detection, e.g. for realtime detection. The results are slightly less accurate than with NO, but performance is significantly higher. With liveDetection enabled, an iPhone 6 can process up to 20 video frames per second, depending on the video frame resolution. It is recommended to turn off liveDetection if you only want to detect once on a static image.

The result contains an SBSDKPolygon (or nil if nothing was found) and an SBSDKDocumentDetectionStatus enum member. The status can be used for user guidance during live detection. It tells you whether the detected polygon is too small, the perspective is insufficient or the orientation is not optimal. If no polygon was detected, the status might contain information about the reason (too noisy background, too dark).
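A guidance UI usually maps each status to a short hint for the user. The sketch below shows one way to do that. The GuidanceStatus enum and its case names are hypothetical stand-ins modeled on the conditions described above; they are not the members of SBSDKDocumentDetectionStatus.

```swift
/// Hypothetical status values modeled on the conditions described in the text.
/// These are NOT the SDK's SBSDKDocumentDetectionStatus members.
enum GuidanceStatus {
    case ok, tooSmall, badPerspective, badOrientation, tooNoisy, tooDark, nothingDetected
}

/// Returns a short hint for the user, or nil when the polygon is acceptable.
func userHint(for status: GuidanceStatus) -> String? {
    switch status {
    case .ok:              return nil
    case .tooSmall:        return "Move closer to the document."
    case .badPerspective:  return "Hold the device parallel to the document."
    case .badOrientation:  return "Rotate the device to match the document."
    case .tooNoisy:        return "Use a plain background."
    case .tooDark:         return "Turn on more light."
    case .nothingDetected: return "Point the camera at a document."
    }
}
```

Because the switch is exhaustive, the compiler flags any status added later that has no hint yet.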

For manual realtime detection with the device’s camera also take a look at the following classes:

Example code for document detection on a video frame:

Objective-C:

- (void)processSampleBuffer:(CMSampleBufferRef)sampleBuffer videoOrientation:(AVCaptureVideoOrientation)orientation {

    // Create an SBSDKDocumentDetector.
    // Note: Usually you store it in a property, for demo purposes we create a new one for each frame.
    SBSDKDocumentDetector *detector = [[SBSDKDocumentDetector alloc] init];

    // Detect a document's outlines on the sample buffer
    SBSDKDocumentDetectorResult *result = [detector detectDocumentPolygonOnSampleBuffer:sampleBuffer
                                                                       visibleImageRect:CGRectZero
                                                                       smoothingEnabled:YES
                                                             useLiveDetectionParameters:YES];

    // If we have an acceptable polygon...
    if (result.status == SBSDKDocumentDetectionStatusOK && result.polygon != nil) {
        // We take a still shot from the camera.
        // Create a UIImage from the still shot sample buffer.
        // Warp the image into the polygon.
        [self.cameraSession captureStillImageWithCompletionHandler:^(CVPixelBufferRef pixelBuffer, NSError *error) {
            UIImage *image = [UIImage sbsdk_imageFromPixelBuffer:pixelBuffer];
            image = [image sbsdk_imageWarpedByPolygon:result.polygon imageScale:1.0];
        }];
    }
}

Swift:

func processSampleBuffer(_ sampleBuffer: CMSampleBuffer, videoOrientation orientation: AVCaptureVideoOrientation) {

    // Create an SBSDKDocumentDetector.
    // Note: Usually you store it in a property, for demo purposes we create a new one for each frame.
    let detector = SBSDKDocumentDetector()

    // Detect a document's outlines on the sample buffer
    let result = detector.detectDocumentPolygon(on: sampleBuffer,
                                                visibleImageRect: .zero,
                                                smoothingEnabled: true,
                                                useLiveDetectionParameters: true)
    // If we have an acceptable polygon...
    if let polygon = result.polygon, result.status == SBSDKDocumentDetectionStatusOK {
        // We take a still shot from the camera.
        // Create a UIImage from the still shot sample buffer.
        // Warp the image into the polygon.
        self.cameraSession.captureStillImage { pixelBuffer, error in
            if let pixelBuffer = pixelBuffer {
                let image = UIImage.sbsdk_image(from: pixelBuffer)?.sbsdk_imageWarped(by: polygon, imageScale: 1.0)
            }
        }
    }
}

2. User Interface for guided, automatic Document Scanning

See SBSDKScannerViewController.

For your convenience, Scanbot SDK comes with a view controller subclass that handles all the camera and detection implementation details for you. Additionally, it provides the UI for Scanbot's document scanning guidance as well as the functionality and UI for manual and automatic shutter release.

The controller's delegate can customize the appearance and behavior of the guidance UI. Furthermore, SBSDKScannerViewController gives its delegate control over how and when frames are analyzed and, most importantly, it delivers the scanned (perspective corrected and cropped) document images to its delegate.

See SBSDKScannerViewControllerDelegate for customization of UI and behavior.

Example code

Objective-C:

#import "SBSDKScanbotSDK.h"

@interface DemoViewController : UIViewController
@end

@interface DemoViewController() <SBSDKScannerViewControllerDelegate>
@property (strong, nonatomic) SBSDKIndexedImageStorage *imageStorage;
@property (strong, nonatomic) SBSDKScannerViewController *scannerViewController;
@property (assign, nonatomic) BOOL viewAppeared;
@end

@implementation DemoViewController

- (void)viewDidLoad {
    [super viewDidLoad];

    // Create an image storage to save the captured document images to
    NSURL *imagesURL = [[SBSDKStorageLocation applicationDocumentsFolderURL]
                           URLByAppendingPathComponent:@"Images"];
    SBSDKStorageLocation *imagesLocation = [[SBSDKStorageLocation alloc] initWithBaseURL:imagesURL];
    self.imageStorage = [[SBSDKIndexedImageStorage alloc] initWithStorageLocation:imagesLocation];

    // Create the SBSDKScannerViewController.
    // We want it to be embedded into self.
    // As we do not want automatic image storage we pass nil here for the image storage.
    self.scannerViewController
    = [[SBSDKScannerViewController alloc] initWithParentViewController:self imageStorage:nil];

    // Set the delegate to self.
    self.scannerViewController.delegate = self;

    // We want unscaled images in full 5 or 8 MPixel size.
    self.scannerViewController.imageScale = 1.0f;
}

- (void)viewDidAppear:(BOOL)animated {
    [super viewDidAppear:animated];
    self.viewAppeared = YES;
}

- (void)viewWillDisappear:(BOOL)animated {
    [super viewWillDisappear:animated];
    self.viewAppeared = NO;
}

- (BOOL)shouldAutorotate {
    // No autorotations.
    return NO;
}

- (UIInterfaceOrientationMask)supportedInterfaceOrientations {
    // Only portrait.
    return UIInterfaceOrientationMaskPortrait;
}

- (UIStatusBarStyle)preferredStatusBarStyle {
    // White statusbar.
    return UIStatusBarStyleLightContent;
}

#pragma mark - SBSDKScannerViewControllerDelegate

- (BOOL)scannerControllerShouldAnalyseVideoFrame:(SBSDKScannerViewController *)controller {
    // We only want to process video frames when self is visible on screen and front most view controller
    return self.viewAppeared && self.presentedViewController == nil;
}

- (void)scannerController:(SBSDKScannerViewController *)controller
  didCaptureDocumentImage:(UIImage *)documentImage {
    // Here we get the perspective corrected and cropped document image after the shutter was (auto)released.
    // We store it into our image storage.
    [self.imageStorage addImage:documentImage];
}

- (void)scannerController:(SBSDKScannerViewController *)controller didCaptureImage:(UIImage *)image {
    // Here we get the full image from the camera. We could run another manual detection here or use the latest
    // detected polygon from the video stream to process the image with.
}

- (void)scannerController:(SBSDKScannerViewController *)controller
         didDetectPolygon:(SBSDKPolygon *)polygon
               withStatus:(SBSDKDocumentDetectionStatus)status {
    // Every time the document detector finishes detection it calls this delegate method.

}

- (UIView *)scannerController:(SBSDKScannerViewController *)controller
       viewForDetectionStatus:(SBSDKDocumentDetectionStatus)status {
    // Here we can return a custom view that we want to use to visualize the latest detection status.
    // We return nil for now to use the standard label.
    return nil;
}

- (UIColor *)scannerController:(SBSDKScannerViewController *)controller
polygonColorForDetectionStatus:(SBSDKDocumentDetectionStatus)status {
    // If the detector has found an acceptable polygon we show it using a green color
    if (status == SBSDKDocumentDetectionStatusOK) {
        return [UIColor greenColor];
    }
    // Otherwise red color.
    return [UIColor redColor];
}

@end

Swift:

class DemoViewController: SBSDKBaseCameraViewController {

    var imageStorage: SBSDKIndexedImageStorage?
    var scannerViewController: SBSDKScannerViewController?
    var isViewAppeared: Bool?

    override func viewDidLoad() {
        super.viewDidLoad()

        // Create an image storage to save the captured document images to
        let imagesURL = SBSDKStorageLocation.applicationDocumentsFolderURL().appendingPathComponent("Images")
        let imagesLocation = SBSDKStorageLocation(baseURL: imagesURL)
        self.imageStorage = SBSDKIndexedImageStorage(storageLocation: imagesLocation)

        // Create the SBSDKScannerViewController.
        // We want it to be embedded into self.
        // As we do not want automatic image storage we pass nil here for the image storage.
        self.scannerViewController = SBSDKScannerViewController(parentViewController: self, imageStorage: nil)

        // Set the delegate to self.
        self.scannerViewController?.delegate = self

        // We want unscaled images in full 5 or 8 MPixel size.
        self.scannerViewController?.imageScale = 1.0
    }

    override func viewDidAppear(_ animated: Bool) {
        super.viewDidAppear(animated)
        self.isViewAppeared = true
    }

    override func viewWillDisappear(_ animated: Bool) {
        super.viewWillDisappear(animated)
        self.isViewAppeared = false
    }

    override var shouldAutorotate: Bool {
        // No autorotations.
        return false
    }

    override var supportedInterfaceOrientations: UIInterfaceOrientationMask {
        // Only portrait.
        return .portrait
    }

    override var preferredStatusBarStyle: UIStatusBarStyle {
        // White statusbar.
        return .lightContent
    }
}

extension DemoViewController: SBSDKScannerViewControllerDelegate {
    func scannerControllerShouldAnalyseVideoFrame(_ controller: SBSDKScannerViewController) -> Bool {
        // We only want to process video frames when self is visible on screen and front most view controller
        return self.isViewAppeared == true && self.presentedViewController == nil
    }

    func scannerController(_ controller: SBSDKScannerViewController, didCaptureDocumentImage documentImage: UIImage) {
        // Here we get the perspective corrected and cropped document image after the shutter was (auto)released.
        // We store it into our image storage.
        self.imageStorage?.add(documentImage)
    }

    func scannerController(_ controller: SBSDKScannerViewController, didCapture image: UIImage) {
        // Here we get the full image from the camera. We could run another manual detection here or use the latest
        // detected polygon from the video stream to process the image with.
    }

    func scannerController(_ controller: SBSDKScannerViewController,
                           didDetect polygon: SBSDKPolygon?,
                           with status: SBSDKDocumentDetectionStatus) {
        // Every time the document detector finishes detection it calls this delegate method.
    }

    func scannerController(_ controller: SBSDKScannerViewController,
                           viewFor status: SBSDKDocumentDetectionStatus) -> UIView? {
        // Here we can return a custom view that we want to use to visualize the latest detection status.
        // We return nil for now to use the standard label.
        return nil
    }

    func scannerController(_ controller: SBSDKScannerViewController,
                           polygonColorFor status: SBSDKDocumentDetectionStatus) -> UIColor {
        // If the detector has found an acceptable polygon we show it using a green color
        if status == SBSDKDocumentDetectionStatusOK {
            return .green
        }
        // Otherwise red color.
        return .red
    }
}

See SBSDKScannerViewControllerDelegate for details.

3. Image Processing

See SBSDKImageProcessor.

Digital image processing is a core part of Scanbot SDK. Basically there are three operations on images:

  • Rotation
  • Image filtering
  • Image warping (perspective correction and cropping) into the shape of a 4-sided polygon

All of these image operations can be called either synchronously on any thread or queue, or asynchronously on a special serial image processing queue. When working with large images it is highly recommended to use the asynchronous API, because its serial queue never processes more than one image at a time. Processing large images concurrently easily causes memory warnings and crashes.
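The pattern of a serial processing queue with completion handlers delivered elsewhere can be sketched with plain Grand Central Dispatch. This is an illustration of the concept only: SerialProcessor and its queue label are hypothetical and do not reflect how SBSDKImageProcessor is implemented.

```swift
import Foundation

/// Hypothetical stand-in for a serial image processing queue (not an SDK type).
final class SerialProcessor {
    // Dispatch queues are serial unless created with the .concurrent attribute,
    // so at most one work item runs at a time.
    private let queue = DispatchQueue(label: "example.image-processing")

    /// Runs `work` on the serial queue and delivers its result to `completion`
    /// on `callbackQueue` (the main queue by default).
    func enqueue<T>(_ work: @escaping () -> T,
                    callbackQueue: DispatchQueue = .main,
                    completion: @escaping (T) -> Void) {
        queue.async {
            let result = work()
            callbackQueue.async { completion(result) }
        }
    }
}
```

Because work items never overlap, peak memory stays close to what a single image needs, no matter how many operations are enqueued.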

Synchronous API can be found in the UIImageSBSDK class extension.

The asynchronous API is implemented as the static class SBSDKImageProcessor. In addition to the three standard operations, SBSDKImageProcessor provides a method to apply custom image processing by specifying an SBSDKImageProcessingHandler block. Its execution is also dispatched to the image processing queue. The operations' completion handlers are called on the main thread.

Each call into the asynchronous API returns an SBSDKProgress object. This NSProgress subclass can be used to observe the progress of the operation and to cancel the operation via the -(void)cancel method.
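Since SBSDKProgress is an NSProgress subclass, the usual Foundation progress API applies. The sketch below uses a plain Progress object to show how a long-running task can report progress and honor cancellation. The process(_:progress:step:) function is a hypothetical worker, not SDK code.

```swift
import Foundation

/// Processes `items` one by one, reports into `progress` and stops early
/// once the progress object has been cancelled.
/// Note: hypothetical worker for illustration, not an SDK function.
func process(_ items: [Int], progress: Progress, step: (Int) -> Void) -> Int {
    var processed = 0
    for item in items {
        if progress.isCancelled { break }
        step(item)
        processed += 1
        progress.completedUnitCount = Int64(processed)
    }
    return processed
}

let progress = Progress(totalUnitCount: 4)
let done = process([1, 2, 3, 4], progress: progress) { item in
    // Cancel after the second item, as a caller holding the progress object might do.
    if item == 2 { progress.cancel() }
}
```

In an app you would typically observe fractionCompleted to drive a progress bar and call cancel() from a button action.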

Example code for custom asynchronous image filter

Objective-C:

// Specify the file URL for the input image
NSURL *inputImageURL = [NSURL URLWithString:@"..."];

// Specify the file URL the output image is written to. Set to nil, if you don't want to save the output image
NSURL *outputImageURL = [NSURL URLWithString:@"..."];

// Create the image processing block
SBSDKImageProcessingHandler processingHandler = ^UIImage *(UIImage *sourceImage, NSError **outError) {
    // Apply a color filter to the input image,
    UIImage *filteredImage = [sourceImage sbsdk_imageFilteredByFilter:SBSDKImageFilterTypeColor];

    // and return the filtered image.
    return filteredImage;
};

// Call the asynchronous image processing API. The returned progress object can be used to cancel the operation.
// Once the operation has been completed extract the image from resultInfo dictionary and do whatever you want with the image.
SBSDKProgress *progress = [SBSDKImageProcessor customFilterImage:inputImageURL
                                                 processingBlock:processingHandler
                                                  outputImageURL:outputImageURL
                                                      completion:^(BOOL finished,
                                                                   NSError *error,
                                                                   NSDictionary *resultInfo) {
    UIImage *outputImage = resultInfo[SBSDKResultInfoDestinationImageKey];
}];

Swift:

// Specify the file URL for the input image
guard let inputImageURL = URL(string: "...") else { return }

// Specify the file URL the output image is written to. Set to nil, if you don't want to save the output image
let outputImageURL = URL(string: "...")

// Create the image processing closure
let processingHandler: SBSDKImageProcessingHandler = { sourceImage, outError in
    // Apply a color filter to the input image,
    let filteredImage = sourceImage.sbsdk_imageFiltered(by: SBSDKImageFilterTypeColor)

    // and return the filtered image.
    return filteredImage
}
let progress = SBSDKImageProcessor.customFilterImage(inputImageURL,
                                                     processingBlock: processingHandler,
                                                     outputImageURL: outputImageURL) { isFinished, error, resultInfo in
    let outputImage = resultInfo?[SBSDKResultInfoDestinationImageKey] as? UIImage
}

Example code for detecting and applying a polygon to an image

Objective-C:

// Specify the input image URL.
NSURL *inputImageURL = [NSURL URLWithString:@"..."];

// Specify the output image URL. Set to nil, if you don't want to save the output image.
NSURL *outputImageURL = [NSURL URLWithString:@"..."];

// Create a document detector.
SBSDKDocumentDetector *detector = [[SBSDKDocumentDetector alloc] init];

// Let the document detector run on our input image.
SBSDKDocumentDetectorResult *result = [detector detectDocumentPolygonOnImage:[UIImage imageWithContentsOfFile:inputImageURL.path]
                                                            visibleImageRect:CGRectZero
                                                            smoothingEnabled:NO
                                                  useLiveDetectionParameters:NO];

// Check the result.
if (result.status == SBSDKDocumentDetectionStatusOK && result.polygon != nil) {

    // If the result is an acceptable polygon, we warp the image into the polygon asynchronously.
    // When warping is done we check the result and on success we pick up the output image.
    // Then do whatever you want with the warped image.
    [SBSDKImageProcessor warpImage:inputImageURL
                           polygon:result.polygon
                    outputImageURL:outputImageURL
                        completion:^(BOOL finished, NSError *error, NSDictionary *resultInfo) {
        if (finished && !error) {
            UIImage *outputImage = resultInfo[SBSDKResultInfoDestinationImageKey];
        }
    }];
} else {
    // No acceptable polygon found.
}

Swift:

// Specify the file URL for the input image
guard let inputImageURL = URL(string: "...") else { return }

// Load image from the specified path
guard let inputImage = UIImage(contentsOfFile: inputImageURL.path) else { return }

// Specify the file URL the output image is written to. Set to nil, if you don't want to save the output image
let outputImageURL = URL(string: "...")

// Create a document detector.
let detector = SBSDKDocumentDetector()

// Let the document detector run on our input image.
let result = detector.detectDocumentPolygon(on: inputImage,
                                            visibleImageRect: .zero,
                                            smoothingEnabled: false,
                                            useLiveDetectionParameters: false)

// Check the result.
if result.status == SBSDKDocumentDetectionStatusOK, let polygon = result.polygon {

    // If the result is an acceptable polygon, we warp the image into the polygon asynchronously.
    // When warping is done we check the result and on success we pick up the output image.
    // Then do whatever you want with the warped image.
    SBSDKImageProcessor.warpImage(inputImageURL,
                                  polygon: polygon,
                                  outputImageURL: outputImageURL) { isFinished, error, resultInfo in
        if isFinished && error == nil {
            let outputImage = resultInfo?[SBSDKResultInfoDestinationImageKey] as? UIImage
        }
    }
} else {
    // No acceptable polygon found.
}

4. PDF Creation

The static SBSDKPDFRenderer class takes an image storage and renders its images into a PDF. One page is generated for each image. The generated pages have sizes that correspond to DIN A4, US Letter or Custom. As the images are embedded unscaled, the resolution of each page depends on its image. When rendering into DIN A4 or US Letter format, the orientation of the page (landscape or portrait) is derived from the image’s aspect ratio.
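Deriving the page orientation from the aspect ratio can be sketched as follows. The function names are hypothetical, the A4 size is given in PDF points rounded to whole numbers, and treating square images as portrait is an assumption rather than documented SDK behavior.

```swift
import Foundation

enum PageOrientation { case portrait, landscape }

/// Picks the page orientation from the image's aspect ratio:
/// wider than tall means landscape. Square images are treated as portrait (assumption).
func pageOrientation(forImageSize size: CGSize) -> PageOrientation {
    return size.width > size.height ? .landscape : .portrait
}

/// Returns the DIN A4 page size in PDF points (1 pt = 1/72 inch),
/// rotated to match the orientation of the image.
func a4PageSize(forImageSize size: CGSize) -> CGSize {
    let portrait = CGSize(width: 595, height: 842) // 210 x 297 mm, rounded to whole points
    switch pageOrientation(forImageSize: size) {
    case .portrait:  return portrait
    case .landscape: return CGSize(width: portrait.height, height: portrait.width)
    }
}
```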

PDFs can be encrypted using SBSDKAESEncrypter or your own custom encryption classes. The PDF’s data is encrypted in memory before it is written to disk. To decrypt the PDF you need to run the matching decryption in your backend or clients.

NOTE: The ScanbotSDK does not lock the PDF with the password, but rather encrypts the actual file. This provides the best level of protection. To decrypt the PDF file you can use the key property of SBSDKAESEncrypter or generate the key yourself using salt, password and iterations.

See SBSDKPDFRendererPageSize for further information.

The operation's completion handler is called on the main thread.

Example code for creating a standard PDF from an image storage

Objective-C:

// Create an image storage to save the captured document images to
NSURL *imagesURL = [[SBSDKStorageLocation applicationDocumentsFolderURL]
                       URLByAppendingPathComponent:@"Images"];
SBSDKStorageLocation *imagesLocation = [[SBSDKStorageLocation alloc] initWithBaseURL:imagesURL];
SBSDKIndexedImageStorage *imageStorage = [[SBSDKIndexedImageStorage alloc] initWithStorageLocation:imagesLocation];

// Define the indices of the images in the image storage you want to render to the PDF, e.g. the first 3.
// To include all images you can simply pass nil for the indexSet. The indexSet is validated internally.
// You don't need to take care if all indices are valid.
NSIndexSet *indexSet = [NSIndexSet indexSetWithIndexesInRange:NSMakeRange(0, 3)];

// Specify the file URL where the PDF will be saved to. Nil here makes no sense.
NSURL *outputPDFURL = [NSURL URLWithString:@"outputPDF"];

// In case you want to encrypt your PDF file, create encrypter using a password and an encryption mode.
SBSDKAESEncrypter *encrypter = [[SBSDKAESEncrypter alloc] initWithPassword:@"password_example#42"
                                                                      mode:SBSDKAESEncrypterModeAES256];     

// Enqueue the operation and store the SBSDKProgress to watch the progress or cancel the operation.
// After completion the PDF is stored at the URL specified in outputPDFURL.
// You can also extract the image store and the PDF URL from the resultInfo.

SBSDKProgress *progress =
[SBSDKPDFRenderer renderImageStorage:imageStorage
                    copyImageStorage:YES
                            indexSet:indexSet
                        withPageSize:SBSDKPDFRendererPageSizeAutoLocale
                           encrypter:encrypter
                              output:outputPDFURL
                   completionHandler:^(BOOL finished, NSError *error, NSDictionary *resultInfo) {
    if (finished && error == nil) {
        SBSDKIndexedImageStorage *completedImageStore = resultInfo[SBSDKResultInfoImageStorageKey];
        NSURL *completedPDFURL = resultInfo[SBSDKResultInfoDestinationFileURLKey];
    }
}];

Swift:

// Create an image storage to save the captured document images to
let imagesURL = SBSDKStorageLocation.applicationDocumentsFolderURL().appendingPathComponent("Images")
let imagesLocation = SBSDKStorageLocation(baseURL: imagesURL)
guard let imageStorage = SBSDKIndexedImageStorage(storageLocation: imagesLocation) else { return }

// Define the indices of the images in the image storage you want to render to the PDF, e.g. the first 3.
// To include all images you can simply pass nil for the indexSet. The indexSet is validated internally.
// You don't need to take care if all indices are valid.
let indexSet = IndexSet(integersIn: 0...2)

// Specify the file URL where the PDF will be saved to. Nil here makes no sense.
guard let outputPDFURL = URL(string: "outputPDF") else { return }

// In case you want to encrypt your PDF file, create encrypter using a password and an encryption mode.
let encrypter = SBSDKAESEncrypter(password: "password_example#42", mode: .AES256)

// Enqueue the operation and store the SBSDKProgress to watch the progress or cancel the operation.
// After completion the PDF is stored at the URL specified in outputPDFURL.
// You can also extract the image store and the PDF URL from the resultInfo.
let progress = SBSDKPDFRenderer.renderImageStorage(imageStorage,
                                                   copyImageStorage: true,
                                                   indexSet: indexSet,
                                                   with: .autoLocale,
                                                   encrypter: encrypter,
                                                   output: outputPDFURL) { isFinished, error, resultInfo in
    if isFinished && error == nil {
        let completedImageStore = resultInfo?[SBSDKResultInfoImageStorageKey] as? SBSDKIndexedImageStorage
        let completedPDFURL = resultInfo?[SBSDKResultInfoDestinationFileURLKey] as? URL

    }
}

5. Optical Character Recognition

The Scanbot OCR feature is based on the Tesseract OCR engine with some modifications and enhancements. The Scanbot SDK uses an optimized custom library of the Tesseract OCR under the hood and provides a convenient API.

For each desired OCR language a corresponding .traineddata file (aka. tessdata) must be installed in the optional resource bundle named ScanbotSDKOCRData.bundle. Also the special data file osd.traineddata is required and must be installed. It is used for orientation and script detection.

The ScanbotSDK.framework itself does not contain any OCR language files to keep the framework small in size. The optional bundle ScanbotSDKOCRData.bundle, provided in the ZIP archive of Scanbot SDK, contains the language files for English and German as well as the osd.traineddata as examples. You can replace or complete these language files as needed. Add this bundle to your project and make sure that it is copied along with your resources into your app.

Preconditions to achieve a good OCR result

Conditions while scanning

A perfect document for OCR is flat and straight, shows no large shadows, folds, or other distracting objects, and is captured in the highest possible resolution. Our UI and algorithms do their best to help you meet these requirements. But as in photography, you can never fully recover image information that was lost during the shot.

Languages

You can use multiple languages for OCR. But since the recognition of characters and words is a very complicated process, increasing the number of languages lowers the overall precision. With more languages, there are more results where the detected word could match. We suggest using as few languages as possible. Make sure that the language you’re trying to detect is supported by the SDK and added to the project.

Size and position

Put the document on a flat surface. Take the photo from straight above in parallel to the document to make sure that the perspective correction doesn’t need to fix much. The document should fill out the camera frame while still showing all of the text that needs to be recognized. This results in more pixels for each character that needs to be detected and hence, more detail. Skewed pages decrease the recognition quality.

Light and shadows

More ambient light is always better. The camera then takes the shot at a lower ISO value, which results in less grainy photos. You should make sure that there are no visible shadows. If you have large shadows, it’s better to take the shot at an angle instead. That’s also why we don’t recommend using the flashlight. From this short distance, it creates a light spot at the center of the document, which decreases the quality.

Focus

The document needs to be properly focused so that the characters are sharp and clear. The auto-focus of the camera works well if you meet the minimum distance the lens requires to be able to focus, which usually starts at 5-10 cm.

Typefaces

The OCR trained data is optimized for common serif and sans-serif font types. Decorative or script fonts decrease the quality of the detection a lot.

Implementing OCR

Download OCR files

You can find a list of all supported OCR languages and download links on this Tesseract wiki page.

⚠️️️ Please choose and download the proper version of the language data files:

OCR API

The SBSDKOpticalTextRecognizer takes one or more images and performs various text related operations on each of the images:

  • Page layout analysis
  • Text recognizing
  • Creation of searchable PDF documents with selectable text

The page layout analysis returns information about page orientation, an angle the image should be rotated to deskew it, the text writing direction or the text line order.

The text recognizing operations take either a collection of images (SBSDKImageStoring), optionally creating a PDF from it, or a single image. The single image operation can also take a rectangle describing which area of the image should be text-recognized. The results found in the completion handler’s resultsDictionary contain information about the found text, where the text was found (boundingBox) and what kind of text it is (word, line, paragraph).

Examples

All SBSDKOpticalTextRecognizer operations run in a separate serial queue.

The operations' completion handlers are called on the main thread.

Example code for performing a page layout analysis:

Objective-C:

// The file URL of the image we want to analyse.
NSURL *imageURL = [NSURL URLWithString:@"..."];

// Start the page layout analysis and store the returned SBSDKProgress object. This object can be used to cancel
// the operation or to observe the progress. See NSProgress.
// In completion check if we finished without error and extract the analyzer result from the resultInfo dictionary.
// Now we can work with the result.
SBSDKProgress *progress =
[SBSDKOpticalTextRecognizer analyseImagePageLayout:imageURL
                                        completion:^(BOOL finished, NSError *error, NSDictionary *resultInfo) {
    if (finished && !error) {

        SBSDKPageAnalyzerResult *result = resultInfo[SBSDKResultInfoPageAnalyzerResultsKey];

        if (result.orientation != SBSDKPageOrientationUp) {

        }
    }
}];

Swift:

// The file URL of the image we want to analyse.
guard let imageURL = URL(string: "...") else { return }

// Start the page layout analysis and store the returned SBSDKProgress object. This object can be used to cancel
// the operation or to observe the progress. See NSProgress.
// In completion check if we finished without error and extract the analyzer result from the resultInfo dictionary.
let progress = SBSDKOpticalTextRecognizer.analyseImagePageLayout(imageURL) { isFinished, error, resultInfo in
    if isFinished && error == nil {
        if let result = resultInfo?[SBSDKResultInfoPageAnalyzerResultsKey] as? SBSDKPageAnalyzerResult {
            // Now we can work with the result.
        }
    }
}

Example code for performing text recognition on an image:

Objective-C:

// The file URL of the image we want to analyse.
NSURL *imageURL = [NSURL URLWithString:@"..."];

// Enqueue the text recognition operation.
// We limit detection to the center area of the image, leaving margins of 25% on each side.
// Only the English language is used for recognition.
// The returned SBSDKProgress object can be used to cancel the operation or observe the progress.
// Upon completion we extract the result from the resultInfo dictionary and log the whole recognized text.
// Then we enumerate all words and log them to the console together with their confidence values and bounding boxes.
SBSDKProgress *progress =
[SBSDKOpticalTextRecognizer recognizeText:imageURL
                                rectangle:CGRectMake(0.25f, 0.25f, 0.5f, 0.5f)
                           languageString:@"eng"
                               completion:^(BOOL finished, NSError *error, NSDictionary *resultInfo) {

    SBSDKOCRResult *result = resultInfo[SBSDKResultInfoOCRResultsKey];
    NSLog(@"Recognized Text: %@", result.recognizedText);
    for (SBSDKOCRPage *page in result.pages) {
        for (SBSDKOCRResultBlock *word in page.words) {
            NSLog(@"Word: %@, Confidence: %0.0f, Box: %@",
                  word.text,
                  word.confidenceValue,
                  NSStringFromCGRect(word.boundingBox));
        }
    }
}];

Swift:

// The file URL of the image we want to analyse.
guard let imageURL = URL(string: "...") else { return }

// Enqueue the text recognition operation.
// We limit detection to the center area of the image, leaving margins of 25% on each side.
// Only the English language is used for recognition.
// The returned SBSDKProgress object can be used to cancel the operation or observe the progress.
// Upon completion we extract the result from the resultInfo dictionary. The recognized text, the words,
// their confidence values and bounding boxes can then be read from it as in the Objective-C example.
let progress = SBSDKOpticalTextRecognizer.recognizeText(imageURL,
                                                        rectangle: CGRect(x: 0.25, y: 0.25, width: 0.5, height: 0.5),
                                                        languageString: "eng") { isFinished, error, resultInfo in
    if let result = resultInfo?[SBSDKResultInfoOCRResultsKey] as? SBSDKOCRResult {
        // Now we can work with the result.
    }
}
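
The rectangle passed in the examples above is given relative to the image size. As a minimal sketch, assuming the same unit coordinate system as described for the Document Detection module (origin at the top left), the following helper converts such a rectangle into pixel coordinates, e.g. for drawing the recognized area on a preview. pixelRect(forUnitRect:imageSize:) is a hypothetical helper for illustration and not part of the Scanbot SDK.

```swift
import Foundation

/// Converts a rectangle given in unit coordinates (0...1 on both axes)
/// into pixel coordinates for an image of the given size.
/// Hypothetical helper for illustration; not part of the Scanbot SDK.
func pixelRect(forUnitRect unitRect: CGRect, imageSize: CGSize) -> CGRect {
    return CGRect(x: unitRect.origin.x * imageSize.width,
                  y: unitRect.origin.y * imageSize.height,
                  width: unitRect.size.width * imageSize.width,
                  height: unitRect.size.height * imageSize.height)
}

// The center area used in the examples above, applied to a 2000 x 1000 pixel image.
let centerArea = CGRect(x: 0.25, y: 0.25, width: 0.5, height: 0.5)
let pixelArea = pixelRect(forUnitRect: centerArea, imageSize: CGSize(width: 2000, height: 1000))
print(pixelArea)
```

For the image size above, the center area starts at pixel (500, 250) and is 1000 pixels wide and 500 pixels high.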

6. Payform Recognition

The SBSDKPayFormScanner class provides functionality to detect and recognize SEPA payforms. It extracts the relevant information fields, e.g. IBAN, BIC, receiver, amount of money and reference, by performing optical text recognition on certain areas of the image.

This module needs the German language package. See SBSDKOpticalTextRecognizer for how to add languages.

For performance reasons the scanner is divided into two parts: detection and recognition. The detection part only tests whether the scanned image contains a payform. The recognition part performs the text extraction and fills the fields.

The common usage is to configure the iPhone's camera for full HD video capturing and run the detection part on each incoming frame synchronously in the video capture queue. When the detector returns a positive result, the recognition part runs on the same full HD frame in the same video capture queue and returns the recognizer's result.

Example code on how to detect and recognize payforms in the video delegate:

Objective-C:

- (void)processSampleBuffer:(CMSampleBufferRef)sampleBuffer videoOrientation:(AVCaptureVideoOrientation)orientation {

    // Create an SBSDKPayFormScanner.
    // Note: Usually you store it in a property, for demo purposes we create a new one for each frame.
    SBSDKPayFormScanner *scanner = [[SBSDKPayFormScanner alloc] init];

    // Recognize a bank transfer form in the sample buffer.
    SBSDKPayFormRecognitionResult *recognitionResult = [scanner recognizeFromSampleBuffer:sampleBuffer
                                                                              orientation:orientation];

    // If we have successfully recognized a payform
    if (recognitionResult.recognitionSuccessful) {
        dispatch_async(dispatch_get_main_queue(), ^{
            // Present the recognition results alert on main thread.
        });
    }
}

Swift:

func processSampleBuffer(_ sampleBuffer: CMSampleBuffer, videoOrientation orientation: AVCaptureVideoOrientation) {

    // Create an SBSDKPayFormScanner.
    // Note: Usually you store it in a property, for demo purposes we create a new one for each frame.
    let scanner = SBSDKPayFormScanner()

    // Recognize a bank transfer form in the sample buffer.
    let recognitionResult = scanner.recognize(from: sampleBuffer, orientation: orientation)

    // If we have successfully recognized a payform
    if recognitionResult?.recognitionSuccessful == true {
        DispatchQueue.main.async {
            // Present the recognition results alert on main thread.
        }
    }
}

7. Barcode Scanner

The Scanbot SDK provides the ability to find and decode multiple types of barcodes. Recognition can be performed on a still image as well as on live video. The result is encapsulated in an array of SBSDKBarcodeScannerResult instances. The Scanbot SDK supports the following types of barcodes:

1D barcodes:

  • EAN_13
  • EAN_8
  • UPC_E
  • CODE_39
  • CODE_93
  • CODE_128
  • ITF (Interleaved 2 of 5)
  • MSI Plessey

2D barcodes:

  • QR_CODE
  • DATA_MATRIX
  • AZTEC
  • PDF_417

To provide better detection results, the Scanbot SDK can accumulate multiple frames before running detection. In this case, the barcode scanner returns empty results until it has been called with the given number of frames. It then performs detection on the frame with the least amount of blur. This feature is intended for use with live detection.
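
The accumulation idea can be sketched roughly as follows. This is an illustration only: FrameAccumulator, ScoredFrame and the sharpness score are hypothetical names, and how the SDK actually measures blur is not described in this document.

```swift
/// A frame together with a sharpness score (higher means less blur).
/// Hypothetical type; the score would come from some blur measurement.
struct ScoredFrame<Frame> {
    let frame: Frame
    let sharpness: Double
}

/// Collects frames until `capacity` is reached, then hands back the sharpest one.
struct FrameAccumulator<Frame> {
    let capacity: Int
    private var frames: [ScoredFrame<Frame>] = []

    init(capacity: Int) {
        // Accumulating fewer than one frame makes no sense, so clamp to 1.
        self.capacity = max(1, capacity)
    }

    /// Adds a frame. Returns nil while still accumulating. Once `capacity`
    /// frames have arrived, returns the sharpest frame and starts over.
    mutating func add(_ frame: Frame, sharpness: Double) -> Frame? {
        frames.append(ScoredFrame(frame: frame, sharpness: sharpness))
        guard frames.count >= capacity else { return nil }
        let best = frames.max { $0.sharpness < $1.sharpness }
        frames.removeAll()
        return best?.frame
    }
}

// Strings stand in for video frames here.
var accumulator = FrameAccumulator<String>(capacity: 3)
let first = accumulator.add("frame A", sharpness: 0.2)
let second = accumulator.add("frame B", sharpness: 0.9)
let third = accumulator.add("frame C", sharpness: 0.5)
print(first ?? "nil", second ?? "nil", third ?? "nil")
```

The first two calls return nil, which corresponds to the empty results mentioned above. The third call returns the frame with the highest sharpness score. With a capacity of 1 every frame is returned immediately.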

Example of detecting barcodes on a video sample buffer:

Objective-C:

@property (nonatomic, strong) SBSDKBarcodeScanner *scanner;

- (void)viewDidLoad {
    [super viewDidLoad];

    //Initializing barcode scanner with number of frames to accumulate before running detection on the sharpest one
    //and barcode types to limit detection results to.
    self.scanner = [[SBSDKBarcodeScanner alloc] initWithFrameAccumulator:1 types:[SBSDKBarcodeType commonTypes]];
}

- (void)processSampleBuffer:(CMSampleBufferRef)sampleBuffer videoOrientation:(AVCaptureVideoOrientation)orientation {
    //Perform detection on sample buffer 
    NSArray<SBSDKBarcodeScannerResult *> *results = [self.scanner detectBarCodesOnSampleBuffer:sampleBuffer
                                                                                   orientation:orientation];
    if (results.count > 0) {
        //handle results
    }
}

Swift:

var scanner: SBSDKBarcodeScanner?

override func viewDidLoad() {
    super.viewDidLoad()

    //Initializing barcode scanner with number of frames to accumulate before running detection on the sharpest one
    //and barcode types to limit detection results to.
    self.scanner = SBSDKBarcodeScanner(frameAccumulator: 1, types: SBSDKBarcodeType.commonTypes())
}

func processSampleBuffer(_ sampleBuffer: CMSampleBuffer, videoOrientation orientation: AVCaptureVideoOrientation) {
    //Perform detection on sample buffer 
    if let results = scanner?.detectBarCodes(on: sampleBuffer, orientation: orientation), !results.isEmpty {
        //handle results
    }
}

8. Data Scanner

The Scanbot SDK comes with separate scanners for many specific use cases. Use cases that are not covered by any of these specialized scanners can be tackled with the Data Scanner module. This module's main class is SBSDKGenericTextLineRecognizer. You can configure its behaviour using the SBSDKGenericTextLineRecognizerConfiguration class.

Within a user-defined rectangular area of interest, the Data Scanner recognizes text (OCR) in consecutive video frames. A customizable block lets you clean up the raw string by filtering out unwanted characters and OCR noise. Additionally, you can validate the result using pattern matching or another block.

The Data Scanner returns an SBSDKGenericTextLineRecognizerResult object when it has recognized text. This result contains the cleaned-up string as well as a boolean flag that tells you whether the validation was successful.

Typical use cases for the Data Scanner module are the recognition of single-line text such as IBANs, insurance numbers, dates and other textual data fields that can easily be validated or pattern-matched.
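
To make the clean-up and validation steps more concrete, here is a minimal sketch for the IBAN use case. It is plain Swift and does not use the SDK's block types. cleanUp(_:) and isValidIBAN(_:) are hypothetical helpers, and the check covers only the overall length and the standard mod-97 checksum, not country-specific formats.

```swift
/// Removes everything except ASCII letters and digits and uppercases the rest.
/// Stands in for the raw text clean-up step (hypothetical helper).
func cleanUp(_ raw: String) -> String {
    return raw.filter { $0.isASCII && ($0.isLetter || $0.isNumber) }.uppercased()
}

/// Validates the cleaned-up string with the IBAN mod-97 checksum.
/// Checks only the overall length and the checksum.
func isValidIBAN(_ candidate: String) -> Bool {
    let iban = cleanUp(candidate)
    guard iban.count >= 15, iban.count <= 34 else { return false }

    // Move the first four characters (country code and check digits) to the end.
    let rearranged = iban.dropFirst(4) + iban.prefix(4)

    var remainder = 0
    for character in rearranged {
        // Digits keep their value, letters map to 10...35 (A = 10, ..., Z = 35).
        guard let value = Int(String(character), radix: 36) else { return false }
        // A letter contributes two decimal digits, a digit contributes one.
        remainder = (remainder * (value > 9 ? 100 : 10) + value) % 97
    }
    return remainder == 1
}

// A commonly used example IBAN, written with OCR-like noise.
let rawText = "de89 3704-0044 0532 0130 00"
let cleaned = cleanUp(rawText)
let isValid = isValidIBAN(rawText)
print(cleaned, isValid)
```

A single misread digit changes the checksum, so isValidIBAN("DE89 3704 0044 0532 0130 01") returns false.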

How is the Data Scanner different from regular OCR? In short, it is more reliable and robust, with a higher confidence in text recognition, because it accumulates the results of multiple video frames as well as your input from the raw text clean-up block.
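
How the SDK accumulates results internally is not described here. As a rough illustration of why accumulation helps, the following sketch counts how often each cleaned-up string was seen and only reports a string once it has been seen a given number of times. ResultAccumulator and requiredVotes are hypothetical names chosen for this example.

```swift
/// Counts how often each cleaned-up string was seen across frames and
/// reports a string once it has been seen often enough.
/// Hypothetical type; not the SDK's actual accumulation strategy.
struct ResultAccumulator {
    let requiredVotes: Int
    private var votes: [String: Int] = [:]

    init(requiredVotes: Int) {
        self.requiredVotes = max(1, requiredVotes)
    }

    /// Registers the text recognized in one frame.
    /// Returns the text once it has reached `requiredVotes`, otherwise nil.
    mutating func add(_ text: String) -> String? {
        let count = (votes[text] ?? 0) + 1
        votes[text] = count
        return count >= requiredVotes ? text : nil
    }
}

var results = ResultAccumulator(requiredVotes: 2)
let frame1 = results.add("DE89370400440532013000")
let frame2 = results.add("DE8937O400440532013000") // a misread frame (letter O instead of 0)
let frame3 = results.add("DE89370400440532013000")
print(frame1 ?? "nil", frame2 ?? "nil", frame3 ?? "nil")
```

A single misread frame never reaches the required number of votes, so only the string that was recognized consistently is reported.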

To make the integration painless for you, the Scanbot SDK provides a simple-to-use plugin viewcontroller named SBSDKGenericTextLineRecognizerViewController that takes over the camera handling, displays the area of interest and runs the Data Scanner. The scanner's results are passed to a delegate.

9. Image Storage

The Scanbot SDK comes with built-in image storage. You can store scanned images using either keyed or indexed storage. We provide convenient interfaces for this via the SBSDKKeyedImageStorage and SBSDKIndexedImageStorage classes.

SBSDKKeyedImageStorage is a simple, thread-safe, multiple-reader-single-writer, key-value disk image cache class. It manages images in a dictionary-like fashion.

SBSDKIndexedImageStorage is a simple, thread-safe, multiple-reader-single-writer, index-based disk image cache class. It manages images in an array-like fashion.
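
For readers unfamiliar with the multiple-reader-single-writer pattern, here is a minimal in-memory sketch using a concurrent dispatch queue with barrier writes. It only illustrates the pattern. ReaderWriterStore is a hypothetical class, and nothing here is meant to describe how the SDK's storage classes are implemented.

```swift
import Dispatch

/// In-memory key-value store guarded by a concurrent queue:
/// reads may run in parallel, writes use a barrier and run exclusively.
/// Hypothetical class for illustration; not part of the Scanbot SDK.
final class ReaderWriterStore<Value> {
    private var items: [String: Value] = [:]
    private let queue = DispatchQueue(label: "example.reader-writer-store",
                                      attributes: .concurrent)

    /// Reads may overlap with other reads.
    func value(forKey key: String) -> Value? {
        return queue.sync { items[key] }
    }

    /// Writes wait for running reads to finish and hold back new ones until done.
    /// Passing nil removes the value for the key.
    func setValue(_ value: Value?, forKey key: String) {
        queue.sync(flags: .barrier) { items[key] = value }
    }
}

let store = ReaderWriterStore<Int>()
store.setValue(42, forKey: "testKey")
let stored = store.value(forKey: "testKey")
store.setValue(nil, forKey: "testKey")
let removed = store.value(forKey: "testKey")
print(stored ?? -1, removed ?? -1)
```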

Both classes support the PNG and JPEG image file formats, represented by the SBSDKImageFileFormat enumeration.

Both classes support encryption, so your data is stored securely. We provide built-in support for AES128 and AES256 encryption. If these algorithms do not meet your requirements, you can create your own encrypter by implementing a class that conforms to the SBSDKStorageCrypting protocol.

For easier access to the device’s file system, the Scanbot SDK provides the convenient helper class SBSDKStorageLocation.

Example of creating an indexed image storage using SBSDKStorageLocation and the built-in AES256 encrypter:

Objective-C:

// Create the URL of the directory where you plan to store images
NSURL *imagesDirectoryURL = [[SBSDKStorageLocation applicationDocumentsFolderURL]
                                URLByAppendingPathComponent:@"Images"];

// Create the storage location using the URL of the images directory
SBSDKStorageLocation *storageLocation = [[SBSDKStorageLocation alloc] initWithBaseURL:imagesDirectoryURL];

//Create encrypter with password and desired encryption mode
SBSDKAESEncrypter *encrypter = [[SBSDKAESEncrypter alloc] initWithPassword:@"password_example#42"
                                                                      mode:SBSDKAESEncrypterModeAES256];

//Create indexed image storage using storage location, image file format and encrypter                         
SBSDKIndexedImageStorage *imageStorage = [[SBSDKIndexedImageStorage alloc] initWithStorageLocation:storageLocation
                                                                                        fileFormat:SBSDKImageFileFormatJPEG
                                                                                     withEncrypter:encrypter];

Swift:

// Create the URL of the directory where you plan to store images
let imagesDirectoryURL = SBSDKStorageLocation.applicationDocumentsFolderURL().appendingPathComponent("Images")

// Create the storage location using the URL of the images directory
let storageLocation = SBSDKStorageLocation(baseURL: imagesDirectoryURL)

//Create encrypter with password and desired encryption mode
let encrypter = SBSDKAESEncrypter(password: "password_example#42", mode: .AES256)

//Create indexed image storage using storage location, image file format and encrypter 
let imageStorage = SBSDKIndexedImageStorage(storageLocation: storageLocation,
                                            fileFormat: .JPEG,
                                            withEncrypter: encrypter)

Examples of some basic operations on Indexed Image storage:

Objective-C:

//In the example we are using temporary image storage
SBSDKIndexedImageStorage *imageStorage = [SBSDKIndexedImageStorage temporaryStorage];

//Create image
UIImage *image = [UIImage imageNamed:@"testImage"];

//Create index
NSInteger index = 0;

//Get image at index
UIImage *storedImage = [imageStorage imageAtIndex:index];

//Add image to storage
BOOL isAdded = [imageStorage addImage:image];

//Insert image at index
BOOL isInserted = [imageStorage insertImage:image atIndex:index];

//Remove image at index
[imageStorage removeImageAtIndex:index];

//Move image from index to index
//Create new index to move image to
NSInteger newIndex = 1;
BOOL isMoved = [imageStorage moveImageFromIndex:index toIndex:newIndex];

//Replace image at index with another image
//Create image to replace stored one
UIImage *newImage = [UIImage imageNamed:@"newTestImage"];
BOOL isReplaced = [imageStorage replaceImageAtIndex:index withImage:newImage];

Swift:

//In the example we are using temporary image storage
guard let imageStorage = SBSDKIndexedImageStorage.temporary() else { return }

//Create image
guard let image = UIImage(named: "testImage") else { return }

//Create index
let index: UInt = 0

//Get image at index
let storedImage = imageStorage.image(at: index)

//Add image to storage
let isAdded = imageStorage.add(image)

//Insert image at index
let isInserted = imageStorage.insert(image, at: index)

//Remove image at index
imageStorage.removeImage(at: index)

//Move image from index to index
//Create new index to move image to
let newIndex: UInt = 1
let isMoved = imageStorage.moveImage(from: index, to: newIndex)

//Replace image at index with another image
//Create image to replace stored one
guard let newImage = UIImage(named: "newTestImage") else { return }
let isReplaced = imageStorage.replaceImage(at: index, with: newImage)

Examples of some basic operations on Keyed Image storage:

Objective-C:

//Create image storage using storage location, in example we are using temporary location
SBSDKKeyedImageStorage *imageStorage = [[SBSDKKeyedImageStorage alloc] initWithStorageLocation:[SBSDKStorageLocation temporaryLocation]];

//Create image
UIImage *image = [UIImage imageNamed:@"testImage"];

//Create key
NSString *key = @"testKey";

//Set image for key
[imageStorage setImage:image forKey:key];

//Get image for key
UIImage *storedImage = [imageStorage imageForKey:key];

//Remove image for key
[imageStorage removeImageForKey:key];

//Remove images for keys matching prefix
//Create prefix
NSString *prefix = @"test";
[imageStorage removeImagesForKeysMatchingPrefix:prefix];

Swift:

//Create image storage using storage location, in example we are using temporary location
let imageStorage = SBSDKKeyedImageStorage()

//Create image
guard let image = UIImage(named: "testImage") else { return }

//Create key
let key = "testKey"

//Set image for key
imageStorage.setImage(image, forKey: key)

//Get image for key
let storedImage = imageStorage.image(forKey: key)

//Remove image for key
imageStorage.removeImage(forKey: key)

//Remove images for keys matching prefix
//Create prefix
let prefix = "test"
imageStorage.removeImagesForKeys(matchingPrefix: prefix)

10. Generic Document Recognizer

The Scanbot SDK provides the ability to detect various types of documents in an image, crop them and recognize the data in their fields using the Generic Document Recognizer.

Currently, the Generic Document Recognizer supports the following types of documents:

  • German ID Card
  • German Passport
  • German driver’s license

If scanning was successful, the user gets an SBSDKGenericDocument object as the result. SBSDKGenericDocument is a hierarchically structured type that contains the document’s type, optionally a list of child documents, a total recognition confidence and a list of the document’s fields. Each field is represented by the SBSDKGenericDocumentField class, holding the field’s type, a cropped image of the field, the recognized text and the field’s recognition confidence.
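
One way to picture this hierarchy, and how its fields can be collected into a flat list, is the following sketch. Field and Document are simplified, hypothetical stand-ins for the SDK classes. The "MRZ" child document and all field values are made up for illustration.

```swift
/// Simplified stand-in for a recognized field (hypothetical type).
struct Field {
    let typeName: String
    let text: String?
    let confidence: Double
}

/// Simplified stand-in for a hierarchical document (hypothetical type).
struct Document {
    let typeName: String
    let fields: [Field]
    let children: [Document]

    /// Collects the fields of this document and of all child documents, depth first.
    func allFields(includingEmptyFields: Bool) -> [Field] {
        var result: [Field] = []
        for field in fields {
            let isEmpty = (field.text ?? "").isEmpty
            if includingEmptyFields || !isEmpty {
                result.append(field)
            }
        }
        for child in children {
            result.append(contentsOf: child.allFields(includingEmptyFields: includingEmptyFields))
        }
        return result
    }
}

// A made-up back side of an ID card with one child document.
let mrz = Document(typeName: "MRZ",
                   fields: [Field(typeName: "DocumentNumber", text: "T22000129", confidence: 0.98),
                            Field(typeName: "CheckDigit", text: nil, confidence: 0.0)],
                   children: [])
let idCardBack = Document(typeName: "DeIdCardBack",
                          fields: [Field(typeName: "EyeColor", text: "BRAUN", confidence: 0.91)],
                          children: [mrz])

let everything = idCardBack.allFields(includingEmptyFields: true)
let nonEmptyNames = idCardBack.allFields(includingEmptyFields: false).map { $0.typeName }
print(everything.count, nonEmptyNames)
```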

For convenience and user interface related tasks, the SBSDKGenericDocument can either be flattened, using the methods -[SBSDKGenericDocument flatDocumentIncludingEmptyChildren:includingEmptyFields:] and -[SBSDKGenericDocument allFieldsIncludingEmptyFields:], or wrapped into a document-type-specific subclass of SBSDKGenericDocumentWrapper using the method -[SBSDKGenericDocument wrap]. Currently the following document wrappers exist:

There are two ways to integrate the component into the application:

Usage of the Classical UI component

The main class of the classical UI component is SBSDKGenericDocumentRecognizerViewController.

Usually this viewcontroller is embedded as a child viewcontroller into another viewcontroller, the parent viewcontroller. The parent viewcontroller usually acts as the delegate and processes the recognition results. You still have full control over the UI elements and can add additional views and buttons to your viewcontroller. The classical component does not display results; instead, it just forwards them to the delegate.

Objective-C:

//
//  GenericDocumentObjcViewController.m
//  ScanbotSDK Examples
//
//  Created by Sebastian Husche on 06.05.21.
//

#import "GenericDocumentObjcViewController.h"
@import ScanbotSDK;

// This is a simple empty viewcontroller which acts as a container and delegate for the SBSDKGenericDocumentRecognizerViewController.
@interface GenericDocumentObjcViewController () <SBSDKGenericDocumentRecognizerViewControllerDelegate>

// The instance of the recognition view controller.
@property (nonatomic, strong) SBSDKGenericDocumentRecognizerViewController* recognizerController;

@end


@implementation GenericDocumentObjcViewController

- (void)viewDidLoad {
    [super viewDidLoad];

    // Define the types of documents that should be recognized.
    // Recognize all supported document types.
    NSArray *allTypes = [SBSDKGenericDocumentRootType allDocumentTypes];

    // Recognize German ID cards only. Front and/or back side.
    NSArray *idCardTypes = @[
        [SBSDKGenericDocumentRootType deIdCardFront],
        [SBSDKGenericDocumentRootType deIdCardBack]
    ];

    // Recognize German passports. Front side only.
    NSArray *passportTypes = @[
        [SBSDKGenericDocumentRootType dePassport]
    ];

    // Recognize German driver's licenses only. Front and/or back side.
    NSArray *driverLicenseTypes = @[
        [SBSDKGenericDocumentRootType deDriverLicenseFront],
        [SBSDKGenericDocumentRootType deDriverLicenseBack]
    ];

    // Create the SBSDKGenericDocumentRecognizerViewController instance
    // and embed it into this viewcontroller's view.
    self.recognizerController
    = [[SBSDKGenericDocumentRecognizerViewController alloc] initWithParentViewController:self
       // Embed the recognizer in this viewcontroller's view.
                                                                              parentView:self.view
       // Pass the above types here as required.
                                                                   acceptedDocumentTypes:allTypes
       // Set the delegate to this viewcontroller.
                                                                                delegate:self];


    // Do additional configuration of the recognizer viewcontroller.

    // E.g. turn on/off camera light on start.
    self.recognizerController.flashLightEnabled = NO;

    // Turn on/off the viewfinder.
    self.recognizerController.showViewFinder = YES;

    // Configure the viewfinder colors.
    self.recognizerController.viewFinderLineColor = [UIColor redColor];
    self.recognizerController.viewFinderBackgroundColor = [[UIColor redColor] colorWithAlphaComponent:0.1f];
}

// The SBSDKGenericDocumentRecognizerViewControllerDelegate implementation.
- (void)documentRecognizerViewController:(nonnull SBSDKGenericDocumentRecognizerViewController *)controller
                    didRecognizeDocument:(nonnull SBSDKGenericDocument *)document
                                 onImage:(nonnull UIImage *)image {

    // Access the document's fields directly by iterating over them.
    for (SBSDKGenericDocumentField *field in document.fields) {
        //Print field type name, field text and field confidence to the console.
        NSLog(@"%@ = %@ (Confidence: %0.3f)", field.type.name, field.value.text, field.value.confidence);
    }

    // Or get a field by its type name.
    SBSDKGenericDocumentField *nameField = [document fieldByTypeName:@"Surname"];
    if (nameField != nil) {
        // Access various properties of the field.
        NSString *fieldTypeName = nameField.type.name;
        NSString* fieldValue = nameField.value.text;
        float confidence = nameField.value.confidence;
    }

    // Or create a wrapper for the document if needed.
    // You must cast it to the specific wrapper subclass.
    SBSDKGenericDocumentWrapper *wrapper = [document wrap];
    // Check the subclass...
    if ([wrapper isKindOfClass:[SBSDKGenericDocumentDeIdCardFront class]]) {
        // ... and cast it.
        SBSDKGenericDocumentDeIdCardFront* frontSideWrapper = (SBSDKGenericDocumentDeIdCardFront*)wrapper;
        // Access the document's fields easily through the wrapper.
        NSString *fieldTypeName = frontSideWrapper.surname.type.name;
        NSString* fieldValue = frontSideWrapper.surname.value.text;
        float confidence = frontSideWrapper.surname.value.confidence;
    }
}

@end

Swift:

import UIKit
import ScanbotSDK

// This is a simple empty viewcontroller which acts as a container and delegate for the SBSDKGenericDocumentRecognizerViewController.
class GenericDocumentViewController: UIViewController, SBSDKGenericDocumentRecognizerViewControllerDelegate {

    // The instance of the recognition view controller.
    var recognizerController: SBSDKGenericDocumentRecognizerViewController!

    override func viewDidLoad() {
        super.viewDidLoad()

        // Define the types of documents that should be recognized.
        // Recognize all supported document types.
        let allTypes: [SBSDKGenericDocumentRootType] = SBSDKGenericDocumentRootType.allDocumentTypes()

        // Recognize German ID cards only. Front and/or back side.
        let idCardTypes: [SBSDKGenericDocumentRootType] = [
            SBSDKGenericDocumentRootType.deIdCardFront(),
            SBSDKGenericDocumentRootType.deIdCardBack()
        ]

        // Recognize German passports. Front side only.
        let passportTypes: [SBSDKGenericDocumentRootType] = [
            SBSDKGenericDocumentRootType.dePassport()
        ]

        // Recognize German driver's licenses only. Front and/or back side.
        let driverLicenseTypes: [SBSDKGenericDocumentRootType] = [
            SBSDKGenericDocumentRootType.deDriverLicenseFront(),
            SBSDKGenericDocumentRootType.deDriverLicenseBack()
        ]

        // Create the SBSDKGenericDocumentRecognizerViewController instance
        // and embed it into this viewcontroller's view.
        self.recognizerController
            = SBSDKGenericDocumentRecognizerViewController(parentViewController: self,
                                                           // Embed the recognizer in this viewcontroller's view.
                                                           parentView: self.view,
                                                           // Pass the above types here as required.
                                                           acceptedDocumentTypes: allTypes,
                                                           // Set the delegate to this viewcontroller.
                                                           delegate: self)


        // Do additional configuration of the recognizer viewcontroller.

        // E.g. turn on/off camera light on start.
        self.recognizerController.flashLightEnabled = false

        // Turn on/off the viewfinder.
        self.recognizerController.showViewFinder = true

        // Configure the viewfinder colors.
        self.recognizerController.viewFinderLineColor = UIColor.red
        self.recognizerController.viewFinderBackgroundColor = UIColor.red.withAlphaComponent(0.1)
    }


    // The SBSDKGenericDocumentRecognizerViewControllerDelegate implementation.
    func documentRecognizerViewController(_ controller: SBSDKGenericDocumentRecognizerViewController,
                                          didRecognizeDocument document: SBSDKGenericDocument,
                                          on image: UIImage) {

        // Access the document's fields directly by iterating over them.
        for field in document.fields {
            // Print field type name, field text and field confidence to the console.
            print("\(field.type.name) = \(field.value?.text ?? "") (Confidence: \(field.value?.confidence ?? 0.0))")
        }


        // Or get a field by its type name.
        if let nameField = document.field(byTypeName: "Surname") {
            // Access various properties of the field.
            let fieldTypeName = nameField.type.name
            let fieldValue = nameField.value?.text
            let confidence = nameField.value?.confidence
        }


        // Or create a wrapper for the document if needed.
        // You must cast it to the specific wrapper subclass.
        if let wrapper = document.wrap() as? SBSDKGenericDocumentDeIdCardFront {
            // Access the document's fields easily through the wrapper.
            let fieldTypeName = wrapper.surname.type.name
            let fieldValue = wrapper.surname.value?.text
            let confidence = wrapper.surname.value?.confidence
        }
    }
}

Usage of the Ready to use UI component

The main class of the Ready to use UI component is SBSDKUIGenericDocumentRecognizerViewController.

Usually this viewcontroller is used as a separate screen for recognizing generic documents. It displays the recognition results in an expandable tableview while the recognizer continues to recognize the document to further improve the result. SBSDKUIGenericDocumentRecognizerViewController even allows you to scan two sides of a document, e.g. an ID card with front and back side, in a single screen. Once the user is happy with the results and presses the submit button, the recognizer viewcontroller is dismissed and passes the results to its delegate.

While you don’t have direct control over the actual recognition viewcontroller, you can use SBSDKUIGenericDocumentRecognizerConfiguration to customize many aspects of it, such as colors, texts and behavior.

Objective-C:

#import "GenericDocumentUIObjcViewController.h"
@import ScanbotSDK;

@interface GenericDocumentUIObjcViewController () <SBSDKUIGenericDocumentRecognizerViewControllerDelegate>

@end

@implementation GenericDocumentUIObjcViewController

- (void)viewDidAppear:(BOOL)animated {
    [super viewDidAppear:animated];
    // Start scanning here. Usually this is an action triggered by some button or menu.
    [self startScanning];
}

- (void)startScanning {

    // Create the default configuration object.
    SBSDKUIGenericDocumentRecognizerConfiguration *configuration
    = [SBSDKUIGenericDocumentRecognizerConfiguration defaultConfiguration];

    // And customize behaviour, user interface and text.

    //Behaviour configuration:
    // Select one of the following document types:
    // German ID card. Front and/or backside.
    configuration.behaviorConfiguration.documentType = [SBSDKUIDocumentType idCardFrontBackDE];

    // Or German driver's license. Front and/or backside.
    //configuration.behaviorConfiguration.documentType = [SBSDKUIDocumentType driverLicenseFrontBackDE];

    // Or German passport. Single sided.
    //configuration.behaviorConfiguration.documentType = [SBSDKUIDocumentType passportDE];

    // E.g. turn on/off camera light on start.
    configuration.behaviorConfiguration.flashEnabled = NO;


    //User interface configuration:
    //Configure various colors.
    configuration.uiConfiguration.detailsBackgroundColor = [UIColor darkGrayColor];
    configuration.uiConfiguration.detailsSectionHeaderBackgroundColor = [UIColor darkGrayColor];

    // Customize the visibility of certain fields in the recognized fields list.
    // Print the field type visibilities, if needed.
    //NSLog(@"%@", configuration.uiConfiguration.fieldTypeVisibilities);

    // Always show the eye-color field in the recognized fields list.
    configuration.uiConfiguration.fieldTypeVisibilities[@"DeIdCardBack.EyeColor"]
    = SBSDKGenericDocumentFieldDisplayStateAlwaysVisible;
    // Show the categories field in the recognized fields list if the field has a value. Otherwise it is hidden.
    configuration.uiConfiguration.fieldTypeVisibilities[@"DeDriverLicenseFront.LicenseCategories"]
    = SBSDKGenericDocumentFieldDisplayStateVisibleIfNotEmpty;


    //Text configuration:
    // Customize UI element's texts.
    configuration.textConfiguration.cancelButtonTitle = @"Abort";
    configuration.textConfiguration.clearButtonTitle = @"Reset";

    // Customize document type and field type names. Used also for internationalisation.
    // Print the document type texts if needed.
    //NSLog(@"%@", configuration.textConfiguration.documentTypeDisplayTexts);

    // Change/localize the display text for the front side of a German ID card.
    configuration.textConfiguration.documentTypeDisplayTexts[@"DeIdCardFront"] = @"Personalausweis (Vorderseite)";

    // Print the field type texts if needed.
    //NSLog(@"%@", configuration.textConfiguration.fieldTypeDisplayTexts);
    // Change/localize the display text for the surname field on the front side of a German driver's license.
    configuration.textConfiguration.fieldTypeDisplayTexts[@"DeDriverLicenseFront.Surname"] = @"Nachname";

    // Present the recognizer view controller modally on this viewcontroller.
    [SBSDKUIGenericDocumentRecognizerViewController presentOn:self configuration:configuration andDelegate:self];
}

// The delegate function implementation.
- (void)genericDocumentRecognizerViewController:(nonnull SBSDKUIGenericDocumentRecognizerViewController *)viewController
                         didFinishWithDocuments:(nonnull NSArray<SBSDKGenericDocument *> *)documents {

    // Get the first document. In case of multiple documents, e.g. front side and back side you need to
    // handle all of them.
    SBSDKGenericDocument *document = documents.firstObject;
    if (document == nil) {
        return;
    }

    // Access the document's fields directly by iterating over them.
    for (SBSDKGenericDocumentField *field in document.fields) {
        //Print field type name, field text and field confidence to the console.
        NSLog(@"%@ = %@ (Confidence: %0.3f)", field.type.name, field.value.text, field.value.confidence);
    }

    // Or get a field by its type name.
    SBSDKGenericDocumentField *nameField = [document fieldByTypeName:@"Surname"];
    if (nameField != nil) {
        // Access various properties of the field.
        NSString *fieldTypeName = nameField.type.name;
        NSString* fieldValue = nameField.value.text;
        float confidence = nameField.value.confidence;
    }

    // Or create a wrapper for the document if needed.
    // You must cast it to the specific wrapper subclass.
    SBSDKGenericDocumentWrapper *wrapper = [document wrap];
    // Check the subclass...
    if ([wrapper isKindOfClass:[SBSDKGenericDocumentDeIdCardFront class]]) {
        // ... and cast it.
        SBSDKGenericDocumentDeIdCardFront* frontSideWrapper = (SBSDKGenericDocumentDeIdCardFront*)wrapper;
        // Access the document's fields easily through the wrapper.
        NSString *fieldTypeName = frontSideWrapper.surname.type.name;
        NSString* fieldValue = frontSideWrapper.surname.value.text;
        float confidence = frontSideWrapper.surname.value.confidence;
    }
}

@end

Swift:

import UIKit
import ScanbotSDK

// The view controller that presents the document recognizer screen.
class GenericDocumentUIViewController: UIViewController, SBSDKUIGenericDocumentRecognizerViewControllerDelegate {

    override func viewDidAppear(_ animated: Bool) {
        super.viewDidAppear(animated)
        // Start scanning here. Usually this is an action triggered by some button or menu.
        self.startScanning()
    }

    func startScanning() {

        // Create the default configuration object.
        let configuration = SBSDKUIGenericDocumentRecognizerConfiguration.default()

        // And customize behaviour, user interface and text.

        //Behaviour configuration:
        // Select one of the following document types:
        // German ID card. Front and/or backside.
        configuration.behaviorConfiguration.documentType = SBSDKUIDocumentType.idCardFrontBackDE()

        // Or German driver's license. Front and/or backside.
        // configuration.behaviorConfiguration.documentType = SBSDKUIDocumentType.driverLicenseFrontBackDE()

        // Or German passport. Single sided.
        //configuration.behaviorConfiguration.documentType = SBSDKUIDocumentType.passportDE()

        // E.g. turn on/off camera light on start.
        configuration.behaviorConfiguration.isFlashEnabled = false


        //User interface configuration:
        //Configure various colors.
        configuration.uiConfiguration.detailsBackgroundColor = UIColor.darkGray
        configuration.uiConfiguration.detailsSectionHeaderBackgroundColor = UIColor.darkGray

        // Customize the visibility of certain fields in the recognized fields list.
        //Print the field type visibilities, if needed.
        // print("\(configuration.uiConfiguration.fieldTypeVisibilities)")

        // Always show the eye-color field in the recognized fields list.
        configuration.uiConfiguration.fieldTypeVisibilities["DeIdCardBack.EyeColor"]
            = SBSDKGenericDocumentFieldDisplayStateAlwaysVisible
        // Show the categories field in the recognized fields list if the field has a value. Otherwise it is hidden.
        configuration.uiConfiguration.fieldTypeVisibilities["DeDriverLicenseFront.LicenseCategories"]
            = SBSDKGenericDocumentFieldDisplayStateVisibleIfNotEmpty


        //Text configuration:
        // Customize UI element's texts.
        configuration.textConfiguration.cancelButtonTitle = "Abort"
        configuration.textConfiguration.clearButtonTitle = "Reset"

        // Customize document type and field type names. Used also for internationalisation.
        // Print the document type texts if needed.
        // print("\(configuration.textConfiguration.documentTypeDisplayTexts)")

        // Change/localize the display text for the front side of a German ID card.
        configuration.textConfiguration.documentTypeDisplayTexts["DeIdCardFront"] = "Personalausweis (Vorderseite)"

        // Print the field type texts if needed.
        // print("\(configuration.textConfiguration.fieldTypeDisplayTexts)")
        // Change/localize the display text for the surname field on the front side of a German driver's license.
        configuration.textConfiguration.fieldTypeDisplayTexts["DeDriverLicenseFront.Surname"] = "Nachname"

        // Present the recognizer view controller modally on this viewcontroller.
        SBSDKUIGenericDocumentRecognizerViewController.present(on: self,
                                                               // Pass the configuration.
                                                               configuration: configuration,
                                                               //Set the delegate
                                                               andDelegate: self)
    }

    // The delegate function implementation.
    func genericDocumentRecognizerViewController(_ viewController: SBSDKUIGenericDocumentRecognizerViewController,
                                                 didFinishWith documents: [SBSDKGenericDocument]) {

        // Get the first document. In case of multiple documents, e.g. front side and back side you need to
        // handle all of them.
        guard let document = documents.first else {
            return
        }

        // Access the document's fields directly by iterating over them.
        for field in document.fields {
            // Print field type name, field text and field confidence to the console.
            print("\(field.type.name) = \(field.value?.text ?? "") (Confidence: \(field.value?.confidence ?? 0.0))")
        }


        // Or get a field by its name.
        if let nameField = document.field(byTypeName: "Surname") {
            // Access various properties of the field.
            let fieldTypeName = nameField.type.name
            let fieldValue = nameField.value?.text
            let confidence = nameField.value?.confidence
        }


        // Or create a wrapper for the document if needed.
        // You must cast it to the specific wrapper subclass.
        if let wrapper = document.wrap() as? SBSDKGenericDocumentDeIdCardFront {
            // Access the document's fields easily through the wrapper.
            let fieldTypeName = wrapper.surname.type.name
            let fieldValue = wrapper.surname.value?.text
            let confidence = wrapper.surname.value?.confidence
        }
    }
}
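
The delegate examples above only look at the first document, although front and back side may both be delivered. The following is a minimal sketch of how the fields of all documents could be combined, assuming you first copy the values you need out of the SDK objects. RecognizedField and mergeFields are hypothetical names and not part of the Scanbot SDK, and the confidence threshold is an illustrative value.

```swift
import Foundation

// Hypothetical stand-in for the values read from a document field in the
// delegate callback (type name, text, confidence). Not an SDK type.
struct RecognizedField {
    let typeName: String
    let text: String
    let confidence: Double
}

/// Merges the fields of several documents (e.g. front and back side) into one
/// dictionary keyed by field type name. Empty values and values below
/// `minimumConfidence` are skipped. If a type name occurs more than once,
/// the value with the higher confidence is kept.
func mergeFields(_ documents: [[RecognizedField]],
                 minimumConfidence: Double) -> [String: RecognizedField] {
    var merged: [String: RecognizedField] = [:]
    for field in documents.joined() {
        guard !field.text.isEmpty, field.confidence >= minimumConfidence else {
            continue
        }
        if let existing = merged[field.typeName],
           existing.confidence >= field.confidence {
            continue
        }
        merged[field.typeName] = field
    }
    return merged
}
```

In the delegate you would build one [RecognizedField] array per document from document.fields and pass all arrays to mergeFields.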

11. NFC Passport Reader

The Scanbot SDK provides a Near Field Communication (NFC) reader for reading data from the NFC chip of a passport. To use it, the following requirements must be met:

  • the device your app runs on must have iOS 13 or higher
  • your app needs to add the “Near Field Communication Tag Reading” capability
  • your app’s Info.plist needs to provide the “Privacy - NFC Scan Usage Description” entry
  • your app’s Info.plist needs to define the passport NFC application ID. To do so, add the entry “com.apple.developer.nfc.readersession.iso7816.select-identifiers” and add the passport application ID “A0000002471001” as its first element

To check whether NFC passport reading is available on the system, call the isPassportReadingAvailable class method of SBSDKNFCPassportReader.
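
The Info.plist requirements listed above are easy to get wrong, so it can help to verify them during development. The following is a minimal sketch of such a check, not part of the Scanbot SDK. missingNFCInfoPlistEntries is a hypothetical helper, and NFCReaderUsageDescription is the raw key behind the “Privacy - NFC Scan Usage Description” entry.

```swift
import Foundation

// Raw Info.plist keys and the passport application ID from the list above.
let nfcUsageDescriptionKey = "NFCReaderUsageDescription"
let nfcSelectIdentifiersKey = "com.apple.developer.nfc.readersession.iso7816.select-identifiers"
let passportApplicationID = "A0000002471001"

/// Hypothetical helper: returns the Info.plist keys that are missing or
/// not set up as described above. An empty result means both entries are fine.
func missingNFCInfoPlistEntries(in info: [String: Any]) -> [String] {
    var missing: [String] = []

    // The usage description must be a non-empty string.
    let usage = info[nfcUsageDescriptionKey] as? String ?? ""
    if usage.isEmpty {
        missing.append(nfcUsageDescriptionKey)
    }

    // The passport application ID must be the first select identifier.
    let identifiers = info[nfcSelectIdentifiersKey] as? [String] ?? []
    if identifiers.first != passportApplicationID {
        missing.append(nfcSelectIdentifiersKey)
    }
    return missing
}
```

In an app you could call missingNFCInfoPlistEntries(in: Bundle.main.infoDictionary ?? [:]) in a debug build and log the result.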

Example of reading data from a passport’s NFC chip using SBSDKNFCPassportReader:

Objective-C:

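// Class extension. MyViewController itself is assumed to be declared in its header as a UIViewController subclass.
@interface MyViewController () <SBSDKPassportReaderDelegate>

// Progress view that shows the reading progress to the user.
@property (strong, nonatomic) IBOutlet UIProgressView *progressView;

@end

@implementation MyViewController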

- (void)startNFCScanning {
    if (@available(iOS 13, *)) {

        // To create an NFC reader instance, provide the passport number, birth date and expiration date.
        // You can obtain this data from the passport's MRZ using our machine-readable zone (MRZ) recognizer.
        // This example uses placeholder values.
        SBSDKNFCPassportReader *nfcReader = [[SBSDKNFCPassportReader alloc] initWithPassportNumber:@""
                                                                                         birthDate:[NSDate date]
                                                                                    expirationDate:[NSDate date]
                                                                                    initialMessage:@"Hold your phone over the passport."
                                                                                          delegate:self];
        //Set the message being displayed on the NFC-scanning UI
        [nfcReader setMessage:@"Authenticating with passport..."];
    }
}

- (void)passportReaderDidConnect:(nonnull SBSDKNFCPassportReader *)reader {
    [reader setMessage:@"Enumerating available data groups..."];

    //Enumerate all available data group types on the passport's NFC chip and return them to the completion handler.
    [reader enumerateDatagroups:^(NSArray<SBSDKNFCDatagroupType *>* types) {
        [reader setMessage:@"Downloading data groups..."];

        //Download and parse the specified data groups
        [reader downloadDatagroupsOfType:types completion:^(NSArray<SBSDKNFCDatagroup *>* groups) {
            [reader setMessage:@"Finished downloading data groups."];
            if (groups.count > 0) {
                dispatch_async(dispatch_get_main_queue(), ^{
                    // Show result in a way you prefer
                });
            }
        }];
    }];
}

- (void)passportReader:(SBSDKNFCPassportReader *)reader didStartReadingGroup:(SBSDKNFCDatagroupType *)type {
    [reader setMessage:[NSString stringWithFormat:@"Downloading data group %@...", type]];
    self.progressView.progress = 0.0f;
    self.progressView.hidden = NO;
}

- (void)passportReader:(SBSDKNFCPassportReader *)reader didProgressReadingGroup:(float)progress {
    self.progressView.progress = progress;
    self.progressView.hidden = NO;
}

- (void)passportReader:(SBSDKNFCPassportReader *)reader didFinishReadingGroup:(SBSDKNFCDatagroupType *)type {
    [reader setMessage:[NSString stringWithFormat:@"Finished downloading data group %@...", type]];
    self.progressView.progress = 1.0f;
    self.progressView.hidden = YES;
}

- (void)passportReaderDidFinishSession:(SBSDKNFCPassportReader *)reader {
    self.progressView.hidden = YES;
}

@end

Swift:

class MyViewController: UIViewController {

    // Progress view that shows the reading progress to the user.
    @IBOutlet var progressView: UIProgressView?

    func startScanning() {
        if #available(iOS 13, *) {
            // To create an NFC reader instance, provide the passport number, birth date and expiration date.
            // You can obtain this data from the passport's MRZ using our machine-readable zone (MRZ) recognizer.
            // This example uses placeholder values.
            let nfcReader = SBSDKNFCPassportReader(passportNumber: "",
                                                   birthDate: Date(),
                                                   expirationDate: Date(),
                                                   initialMessage: "Hold your phone over the passport.",
                                                   delegate: self)
            nfcReader.setMessage("Authenticating with passport...")
        }
    }
}

extension MyViewController: SBSDKPassportReaderDelegate {
    func passportReaderDidConnect(_ reader: SBSDKNFCPassportReader) {
        reader.setMessage("Enumerating available data groups...")

        //Enumerate all available data group types on the passport's NFC chip and return them to the completion handler.
        reader.enumerateDatagroups { types in
            reader.setMessage("Downloading data groups...")

            //Download and parse the specified data groups
            reader.downloadDatagroups(ofType: types) { groups in
                reader.setMessage("Finished downloading data groups.")
                if !groups.isEmpty {
                    DispatchQueue.main.async {
                        // Show result in a way you prefer
                    }
                }
            }
        }
    }

    func passportReader(_ reader: SBSDKNFCPassportReader, didStartReadingGroup type: String) {
        reader.setMessage("Downloading data group \(type)...")
        self.progressView?.progress = 0.0
        self.progressView?.isHidden = false
    }

    func passportReader(_ reader: SBSDKNFCPassportReader, didProgressReadingGroup progress: Float) {
        self.progressView?.progress = progress
        self.progressView?.isHidden = false
    }

    func passportReader(_ reader: SBSDKNFCPassportReader, didFinishReadingGroup type: String) {
        reader.setMessage("Finished downloading data group \(type)...")
        self.progressView?.progress = 1.0
        self.progressView?.isHidden = true
    }

    func passportReaderDidFinishSession(_ reader: SBSDKNFCPassportReader) {
        self.progressView?.isHidden = true
    }
}
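
The reader initializer shown above takes the birth date and expiration date as date objects, while an MRZ encodes dates as six-digit YYMMDD strings. The following is a minimal sketch of that conversion, not part of the Scanbot SDK. dateFromMRZ is a hypothetical helper. The century rule (expiry dates in 20xx, birth dates never after referenceYear) and the use of UTC are assumptions that you should check against the values the reader expects.

```swift
import Foundation

/// Hypothetical helper: converts a six-digit MRZ date (YYMMDD) into a Date
/// at midnight UTC. Returns nil for malformed or impossible dates.
func dateFromMRZ(_ mrz: String, isExpiry: Bool, referenceYear: Int) -> Date? {
    // An MRZ date consists of exactly six ASCII digits.
    guard mrz.count == 6, mrz.allSatisfy({ $0.isASCII && $0.isNumber }) else {
        return nil
    }
    let digits = mrz.compactMap { $0.wholeNumberValue }
    let yy = digits[0] * 10 + digits[1]
    let month = digits[2] * 10 + digits[3]
    let day = digits[4] * 10 + digits[5]

    // Assumed century rule: expiry dates lie in 20xx, birth dates must not
    // lie after the reference year.
    var year = 2000 + yy
    if !isExpiry && year > referenceYear {
        year -= 100
    }

    var calendar = Calendar(identifier: .gregorian)
    calendar.timeZone = TimeZone(secondsFromGMT: 0)!
    var components = DateComponents()
    components.year = year
    components.month = month
    components.day = day

    // Calendar may roll over invalid values (e.g. month 13), so verify
    // that the resulting date still has the requested components.
    guard let date = calendar.date(from: components) else { return nil }
    let check = calendar.dateComponents([.year, .month, .day], from: date)
    guard check.year == year, check.month == month, check.day == day else {
        return nil
    }
    return date
}
```

With this helper, the birth date from an MRZ could be passed as dateFromMRZ("850412", isExpiry: false, referenceYear: 2024). Note that a two-digit year cannot distinguish holders older than 100 years.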

12. EU License Plate Scanner

The Scanbot SDK provides a scanner that recognizes a vehicle’s license plate and validates the result. Ideally, you reuse a single scanner instance across consecutive video frames.

Example of scanning license plates using SBSDKLicensePlateScanner:

Objective-C:

@property (nonatomic, strong) SBSDKLicensePlateScanner *scanner;
@property (nonatomic, assign) CGRect finderRectangle;

- (void)viewDidLoad {
    [super viewDidLoad];

    //Create a configuration for a license plate scanner
    SBSDKLicensePlateScannerConfiguration *configuration = [[SBSDKLicensePlateScannerConfiguration alloc] init];

    //Create a scanner instance using a configuration
    self.scanner = [[SBSDKLicensePlateScanner alloc] initWithConfiguration:configuration];

    //Create a finder rectangle
    //Note: For this example rectangle values are hardcoded, preferably use relative values
    self.finderRectangle = CGRectMake(0, 0, 300, (300/4));
}

- (void)processSampleBuffer:(CMSampleBufferRef)sampleBuffer videoOrientation:(AVCaptureVideoOrientation)orientation {

    //Create an image from sample buffer using Scanbot SDK helper methods
    UIImage *image = [UIImage sbsdk_imageFromSampleBuffer:sampleBuffer orientation:orientation];

    //Scan license plate from image in expected finder rectangle
    SBSDKLicensePlateScannerResult *result = [self.scanner scanVideoFrameImage:image inRect:self.finderRectangle];

    //If recognition was successful handle the result
    if (result.isValidationSuccessful) {
        //handle result
    }
}

Swift:

var scanner: SBSDKLicensePlateScanner?

//Create a finder rectangle
//Note: For this example rectangle values are hardcoded, preferably use relative values
var finderRectangle: CGRect = CGRect(x: 0, y: 0, width: 300, height: (300/4))

override func viewDidLoad() {
    super.viewDidLoad()
    //Create a configuration for a license plate scanner
    let configuration = SBSDKLicensePlateScannerConfiguration()

    //Create a scanner instance using a configuration
    self.scanner = SBSDKLicensePlateScanner(configuration: configuration)
}

func processSampleBuffer(_ sampleBuffer: CMSampleBuffer, videoOrientation orientation: AVCaptureVideoOrientation) {
    //Create an image from sample buffer using Scanbot SDK helper methods
    if let image = UIImage.sbsdk_image(from: sampleBuffer, orientation: orientation) {

        //Scan license plate from image in expected finder rectangle
        //If recognition was successful handle the result
        if let result = self.scanner?.scanVideoFrameImage(image, in: finderRectangle), result.isValidationSuccessful {
            //handle result
        }
    }
}
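
The comments in the examples above recommend relative values over a hardcoded finder rectangle. The following is a minimal sketch of one way to derive it, assuming a centered finder with the same 4:1 aspect ratio as in the example. finderRect, widthFraction and aspectRatio are hypothetical names. The coordinate space that the scanner expects for its rectangle is not specified here, so pass bounds in that space.

```swift
import Foundation

/// Hypothetical helper: returns a finder rectangle centered in `bounds`.
/// Its width is `widthFraction` of the bounds width, and its height follows
/// from `aspectRatio` (width divided by height).
func finderRect(in bounds: CGRect,
                widthFraction: CGFloat = 0.8,
                aspectRatio: CGFloat = 4.0) -> CGRect {
    let width = bounds.width * widthFraction
    let height = width / aspectRatio
    return CGRect(x: bounds.midX - width / 2,
                  y: bounds.midY - height / 2,
                  width: width,
                  height: height)
}
```

For bounds of 400 x 800 and the default parameters, this yields a 320 x 80 rectangle centered in the bounds, and the result scales with the bounds instead of staying fixed at 300 x 75.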