Modules in Detail

1. Document Detection

SBSDKDocumentDetector uses digital image processing algorithms to find rectangular, document-like polygons in a digital image.

- (SBSDKDocumentDetectorResult *)detectDocumentPolygonOnImage:(UIImage *)image

As input, a UIImage object or a CMSampleBufferRef is accepted. Typically, the camera API in Apple's AVFoundation framework returns CMSampleBufferRef objects. This way you can easily detect document polygons on a live camera video stream or a still image shot. Images from the photo library are usually converted to UIImage objects before being passed to the detector.

The visibleRect parameter lets you limit the detection area on the image. If you pass CGRectZero or a rectangle with zero width or height, the whole image is used for detection. The rectangle must be provided in the unit coordinate system { 0.0f, 0.0f } - { 1.0f, 1.0f }, where { 0.0f, 0.0f } is the top left and { 1.0f, 1.0f } the bottom right corner of the image. The detector ignores edges that are outside the visibleRect.
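For illustration, a visibleRect restricting detection to the horizontally centered half of the image could be built like this:

```objc
// Build a visibleRect in the unit coordinate system:
// { 0.0f, 0.0f } is the top left, { 1.0f, 1.0f } the bottom right corner of the image.
// This rectangle covers the centered half of the image's width and its full height.
CGRect visibleRect = CGRectMake(0.25f, 0.0f, 0.5f, 1.0f);

// Passing CGRectZero instead would run the detection on the whole image.
CGRect wholeImage = CGRectZero;
```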

The smooth parameter is typically used for realtime detection with the same detector instance used all the time. When set to YES, consecutively detected document polygons within a certain timeframe are accumulated into a single dampened polygon. This prevents jumping polygons in situations where the recognized edges change from video frame to video frame. If you use smoothing, you should also observe the device's motion and clear the consecutive polygon buffer whenever significant motion is detected. This is done by calling the detector's -resetSmoothingData method.
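For illustration, clearing the smoothing buffer on significant device motion could be sketched with Core Motion like this. The motionManager property and the 0.2 g threshold are our own example choices, not SDK requirements:

```objc
#import <CoreMotion/CoreMotion.h>
#import <math.h>

// Observe the device's motion and clear the detector's smoothing buffer on
// significant movement. Assumes a CMMotionManager stored in self.motionManager.
- (void)startMotionObservationForDetector:(SBSDKDocumentDetector *)detector {
    self.motionManager = [[CMMotionManager alloc] init];
    self.motionManager.accelerometerUpdateInterval = 0.1;
    [self.motionManager startAccelerometerUpdatesToQueue:[NSOperationQueue mainQueue]
                                             withHandler:^(CMAccelerometerData *data, NSError *error) {
        CMAcceleration a = data.acceleration;
        double magnitude = sqrt(a.x * a.x + a.y * a.y + a.z * a.z);
        // At rest the magnitude is close to 1 g. A deviation above the
        // (arbitrary) threshold counts as significant motion here.
        if (fabs(magnitude - 1.0) > 0.2) {
            [detector resetSmoothingData];
        }
    }];
}
```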

You typically set the liveDetection flag to YES when you need the fastest detection possible, e.g. for realtime detection. The results are slightly less accurate than with the flag set to NO, but the performance rises significantly. With liveDetection enabled, an iPhone 6 can process up to 20 video frames per second, depending on the video frame resolution. It is recommended to turn off liveDetection if you want to detect only once on a static image.

The result contains an SBSDKPolygon (or nil if nothing was found) and an SBSDKDocumentDetectionStatus enum member. The status can be used for user guidance during live detection: it tells you whether the detected polygon is too small, the perspective is insufficient, or the orientation is not optimal. If no polygon was detected, the status may contain information about why (e.g. too noisy a background, or too dark).

For manual realtime detection with the device's camera, also take a look at the SDK's camera-related helper classes.

Example code for document detection on a video frame

- (void)processSampleBuffer:(CMSampleBufferRef)sampleBuffer
           videoOrientation:(AVCaptureVideoOrientation)orientation {

    // Create an SBSDKDocumentDetector.
    // Note: Usually you store it in a property; for demo purposes we create a new one for each frame.
    SBSDKDocumentDetector *detector = [[SBSDKDocumentDetector alloc] init];

    // Detect a document's outline on the sample buffer.
    SBSDKDocumentDetectorResult *result = [detector detectDocumentPolygonOnSampleBuffer:sampleBuffer
                                                                       visibleImageRect:CGRectZero
                                                                       smoothingEnabled:YES
                                                             useLiveDetectionParameters:YES];

    // If we have an acceptable polygon...
    if (result.status == SBSDKDocumentDetectionStatusOK && result.polygon != nil) {
        // ...we take a still shot from the camera,
        // create a UIImage from the still shot's pixel buffer
        // and warp the image into the polygon.
        [self.cameraSession captureStillImageWithCompletionHandler:^(CVPixelBufferRef pixelBuffer, NSError *error) {
            UIImage *image = [UIImage sbsdk_imageFromPixelBuffer:pixelBuffer];
            image = [image sbsdk_imageWarpedByPolygon:result.polygon imageScale:1.0];
            // Do whatever you want with the warped document image here.
        }];
    }
}

2. User Interface for guided, automatic Document Scanning

See SBSDKScannerViewController.

For your convenience, Scanbot SDK comes with a view controller subclass that handles all the camera and detection implementation details for you. Additionally, it provides the UI for Scanbot's document scanning guidance as well as the functionality and UI for manual and automatic shutter release.

The controller's delegate can customize the appearance and behavior of the guidance UI. Furthermore, SBSDKScannerViewController gives its delegate control over how and when frames are analyzed and, most importantly, it delivers the scanned, perspective-corrected and cropped document images to its delegate.

See SBSDKScannerViewControllerDelegate for customization of UI and behavior.

Example code

#import "SBSDKScanbotSDK.h"

@interface DemoViewController : UIViewController
@end

@interface DemoViewController () <SBSDKScannerViewControllerDelegate>
@property (strong, nonatomic) SBSDKIndexedImageStorage *imageStorage;
@property (strong, nonatomic) SBSDKScannerViewController *scannerViewController;
@property (assign, nonatomic) BOOL viewAppeared;
@end

@implementation DemoViewController

- (void)viewDidLoad {
    [super viewDidLoad];

    // Create an image storage to save the captured document images to.
    // Note: The subfolder name "Images" is just an example.
    NSURL *imagesURL = [[SBSDKStorageLocation applicationDocumentsFolderURL] URLByAppendingPathComponent:@"Images"];
    SBSDKStorageLocation *imagesLocation = [[SBSDKStorageLocation alloc] initWithBaseURL:imagesURL];
    self.imageStorage = [[SBSDKIndexedImageStorage alloc] initWithStorageLocation:imagesLocation];

    // Create the SBSDKScannerViewController.
    // We want it to be embedded into self.
    // As we do not want automatic image storage we pass nil here for the image storage.
    self.scannerViewController = [[SBSDKScannerViewController alloc] initWithParentViewController:self imageStorage:nil];

    // Set the delegate to self.
    self.scannerViewController.delegate = self;

    // We want unscaled images in full 5 or 8 megapixel size.
    self.scannerViewController.imageScale = 1.0f;
}

- (void)viewWillDisappear:(BOOL)animated {
    [super viewWillDisappear:animated];
    self.viewAppeared = NO;
}

- (void)viewDidAppear:(BOOL)animated {
    [super viewDidAppear:animated];
    self.viewAppeared = YES;
}

- (BOOL)shouldAutorotate {
    // No autorotations.
    return NO;
}

- (UIInterfaceOrientationMask)supportedInterfaceOrientations {
    // Only portrait.
    return UIInterfaceOrientationMaskPortrait;
}

- (UIStatusBarStyle)preferredStatusBarStyle {
    // White status bar.
    return UIStatusBarStyleLightContent;
}
#pragma mark - SBSDKScannerViewControllerDelegate

- (BOOL)scannerControllerShouldAnalyseVideoFrame:(SBSDKScannerViewController *)controller {
    // We only want to process video frames when self is visible on screen and is the frontmost view controller.
    return self.viewAppeared && self.presentedViewController == nil;
}

- (void)scannerController:(SBSDKScannerViewController *)controller
  didCaptureDocumentImage:(UIImage *)documentImage {
    // Here we get the perspective-corrected and cropped document image after the shutter was (auto)released.
    // We store it in our image storage.
    [self.imageStorage addImage:documentImage];
}

- (void)scannerController:(SBSDKScannerViewController *)controller didCaptureImage:(UIImage *)image {
    // Here we get the full image from the camera. We could run another manual detection here or use the latest
    // detected polygon from the video stream to process the image with.
}

- (void)scannerController:(SBSDKScannerViewController *)controller
         didDetectPolygon:(SBSDKPolygon *)polygon
               withStatus:(SBSDKDocumentDetectionStatus)status {
    // Every time the document detector finishes detection it calls this delegate method.
}

- (UIView *)scannerController:(SBSDKScannerViewController *)controller
       viewForDetectionStatus:(SBSDKDocumentDetectionStatus)status {
    // Here we can return a custom view that we want to use to visualize the latest detection status.
    // We return nil for now to use the standard label.
    return nil;
}

- (UIColor *)scannerController:(SBSDKScannerViewController *)controller
polygonColorForDetectionStatus:(SBSDKDocumentDetectionStatus)status {
    // If the detector has found an acceptable polygon we show it in green,
    if (status == SBSDKDocumentDetectionStatusOK) {
        return [UIColor greenColor];
    }
    // otherwise in red.
    return [UIColor redColor];
}

@end
See SBSDKScannerViewControllerDelegate for details.

3. Image Processing

See SBSDKImageProcessor.

Digital image processing is a core part of Scanbot SDK. Basically there are three operations on images:

  • Rotation
  • Image filtering
  • Image warping (perspective correction and cropping) into the shape of a 4-sided polygon

All of these image operations can be called either synchronously, from any thread or queue, or asynchronously on a special serial image-processing queue. When working with large images it is highly recommended to use the asynchronous API, because it serializes the work and prevents parallel processing of images. Processing large images concurrently easily causes memory warnings and crashes.

The synchronous API can be found in the UIImageSBSDK class extension.

The asynchronous API is implemented as the static class SBSDKImageProcessor. In addition to the three standard operations, SBSDKImageProcessor provides a method to apply custom image processing by specifying an SBSDKImageProcessingHandler block. Its execution is also dispatched to the image-processing queue. The operations' completion handlers are called on the main thread.

Each call into the asynchronous API returns an SBSDKProgress object. This NSProgress subclass can be used to observe the progress of the operation, and also to cancel the operation via its -(void)cancel method.
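Since SBSDKProgress is an NSProgress subclass, the standard NSProgress facilities apply. A minimal sketch (the helper methods below are our own, not SDK API):

```objc
// Hypothetical helpers showing standard NSProgress facilities on an
// SBSDKProgress returned by any of the asynchronous calls.
- (void)trackProgress:(SBSDKProgress *)progress {
    // Observe the completed fraction via key-value observing.
    // (Remember to implement -observeValueForKeyPath:ofObject:change:context:
    // and to remove the observer when the operation finishes.)
    [progress addObserver:self
               forKeyPath:@"fractionCompleted"
                  options:NSKeyValueObservingOptionNew
                  context:NULL];
}

- (void)cancelTapped:(SBSDKProgress *)progress {
    // Cancel the running operation, e.g. from a cancel button's action.
    [progress cancel];
}
```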

Example code for custom asynchronous image filter

// Specify the file URL for the input image
NSURL *inputImageURL = [NSURL URLWithString:@"..."];

// Specify the file URL the output image is written to. Set to nil, if you don't want to save the output image
NSURL *outputImageURL = [NSURL URLWithString:@"..."];

// Create the image processing block.
SBSDKImageProcessingHandler processingHandler = ^UIImage *(UIImage *sourceImage, NSError **outError) {
    // Apply a color filter to the input image,
    UIImage *filteredImage = [sourceImage sbsdk_imageFilteredByFilter:SBSDKImageFilterTypeColor];

    // and return the filtered image.
    return filteredImage;
};

// Call the asynchronous image processing API. The returned progress object can be used to cancel the operation.
// Once the operation has completed, extract the image from the resultInfo dictionary and do whatever you want with it.
SBSDKProgress *progress = [SBSDKImageProcessor customFilterImage:inputImageURL
                                               processingHandler:processingHandler
                                                      completion:^(BOOL finished,
                                                                   NSError *error,
                                                                   NSDictionary *resultInfo) {
    if (finished && !error) {
        UIImage *outputImage = resultInfo[SBSDKResultInfoDestinationImageKey];
        // Do whatever you want with the output image here.
    }
}];

Example code for detecting and applying a polygon to an image

// Specify the input image URL.
NSURL *inputImageURL = [NSURL URLWithString:@"..."];

// Specify the output image URL. Set to nil, if you don't want to save the output image.
NSURL *outputImageURL = [NSURL URLWithString:@"..."];

// Create a document detector.
SBSDKDocumentDetector *detector = [[SBSDKDocumentDetector alloc] init];

// Let the document detector run on our input image.
UIImage *inputImage = [UIImage imageWithContentsOfFile:inputImageURL.path];
SBSDKDocumentDetectorResult *result = [detector detectDocumentPolygonOnImage:inputImage
                                                            visibleImageRect:CGRectZero
                                                            smoothingEnabled:NO
                                                  useLiveDetectionParameters:NO];

// Check the result.
if (result.status == SBSDKDocumentDetectionStatusOK && result.polygon != nil) {

    // If the result is an acceptable polygon, we warp the image into the polygon asynchronously.
    // When warping is done we check the result and on success we pick up the output image.
    // Then do whatever you want with the warped image.
    [SBSDKImageProcessor warpImage:inputImageURL
                           polygon:result.polygon
                        completion:^(BOOL finished, NSError *error, NSDictionary *resultInfo) {
        if (finished && !error) {
            UIImage *outputImage = resultInfo[SBSDKResultInfoDestinationImageKey];
            // Do whatever you want with the warped image here.
        }
    }];
} else {
    // No acceptable polygon found.
}

4. PDF Creation

The SBSDKPDFRenderer static class takes an image storage and renders its images into a PDF. For each image a page is generated. The generated pages have sizes corresponding to DIN A4, US Letter, or a custom size. As the images are embedded unscaled, the resolution of each page depends on its image. When rendering into DIN A4 or US Letter format, the orientation of the page (landscape or portrait) is derived from the image's aspect ratio.

See SBSDKPDFRendererPageSize for further information.

The operation's completion handler is called on the main thread.

Example code for creating a standard PDF from an image storage

// Create an image storage holding the captured document images.
// Note: The subfolder and file names are just examples.
NSURL *imagesURL = [[SBSDKStorageLocation applicationDocumentsFolderURL] URLByAppendingPathComponent:@"Images"];
SBSDKStorageLocation *imagesLocation = [[SBSDKStorageLocation alloc] initWithBaseURL:imagesURL];
SBSDKIndexedImageStorage *imageStorage = [[SBSDKIndexedImageStorage alloc] initWithStorageLocation:imagesLocation];

// Define the indices of the images in the image storage you want to render into the PDF, e.g. the first 3.
// To include all images you can simply pass nil for the indexSet. The indexSet is validated internally,
// so you don't need to make sure that all indices are valid.
NSIndexSet *indexSet = [NSIndexSet indexSetWithIndexesInRange:NSMakeRange(0, 3)];

// Specify the file URL where the PDF will be saved to. Passing nil here makes no sense.
NSURL *outputPDFURL = [[SBSDKStorageLocation applicationDocumentsFolderURL] URLByAppendingPathComponent:@"output.pdf"];

// Enqueue the operation and store the SBSDKProgress to watch the progress or cancel the operation.
// After completion the PDF is stored at the URL specified in outputPDFURL.
// You can also extract the image storage and the PDF URL from the resultInfo.
// Note: The exact parameter labels and the page size constant may differ in your SDK version.
SBSDKProgress *progress =
[SBSDKPDFRenderer renderImageStorage:imageStorage
                            indexSet:indexSet
                        withPageSize:SBSDKPDFRendererPageSizeA4
                              output:outputPDFURL
                   completionHandler:^(BOOL finished, NSError *error, NSDictionary *resultInfo) {
    if (finished && error == nil) {
        SBSDKIndexedImageStorage *completedImageStorage = resultInfo[SBSDKResultInfoImageStorageKey];
        NSURL *completedPDFURL = resultInfo[SBSDKResultInfoDestinationFileURLKey];
        // Do whatever you want with the PDF here.
    }
}];

5. Optical Character Recognition

The Scanbot OCR feature is based on the Tesseract OCR engine with some modifications and enhancements. The Scanbot SDK uses an optimized custom library of the Tesseract OCR under the hood and provides a convenient API.

For each desired OCR language a corresponding .traineddata file (a.k.a. tessdata) must be installed in the optional resource bundle named ScanbotSDKOCRData.bundle. The special data file osd.traineddata is also required; it is used for orientation and script detection.

The ScanbotSDK.framework itself does not contain any OCR language files, to keep the framework small in size. The optional bundle ScanbotSDKOCRData.bundle, provided in the ZIP archive of the Scanbot SDK, contains the language files for English and German, as well as osd.traineddata, as examples. You can replace or extend these language files as needed. Add this bundle to your project and make sure it is copied along with your resources into your app.

Preconditions to achieve a good OCR result

Conditions while scanning

A perfect document for OCR is flat and straight, shows no large shadows, folds, or other objects that could disturb recognition, and is captured at the highest possible resolution. Our UI and algorithms do their best to help you meet these requirements. But, as in photography, you can never fully recover image information that was lost during the shot.


Languages

You can use multiple languages for OCR. But since the recognition of characters and words is a very complex process, increasing the number of languages lowers the overall precision. With more languages, there are more results the detected word could match. We suggest using as few languages as possible. Make sure that the language you're trying to detect is supported by the SDK and added to the project.

Size and position

Put the document on a flat surface. Take the photo from straight above in parallel to the document to make sure that the perspective correction doesn’t need to fix much. The document should fill out the camera frame while still showing all of the text that needs to be recognized. This results in more pixels for each character that needs to be detected and hence, more detail. Skewed pages decrease the recognition quality.

Light and shadows

More ambient light is always better. The camera takes the shot at a lower ISO value, which results in less grainy photos. You should make sure that there are no visible shadows. If you have large shadows, it's better to take the shot at an angle instead. That's why we also don't recommend using the flashlight. From this low distance, it creates a light spot at the center of the document, which decreases the quality.


Focus

The document needs to be properly focused so that the characters are sharp and clear. The auto-focus of the camera works well if you meet the minimum required distance for the lens to be able to focus, which usually starts at 5-10 cm.


Fonts

The OCR trained data is optimized for common serif and sans-serif font types. Decorative or script fonts decrease the quality of the detection a lot.

Implementing OCR

Download OCR files

You can find a list of all supported OCR languages and download links on this Tesseract wiki page.

⚠️ Please choose and download the proper version of the language data files.


The SBSDKOpticalTextRecognizer takes one or more images and performs various text-related operations on each of them:

  • Page layout analysis
  • Text recognition
  • Creation of searchable PDF documents with selectable text

The page layout analysis returns information about the page orientation, the angle by which the image should be rotated to deskew it, the text writing direction, and the text line order.

The text recognition operations take either a collection of images (SBSDKImageStoring), optionally creating a PDF from it, or a single image. The single-image operation can also take a rectangle describing which area of the image should be recognized. The results found in the completion handler's resultInfo dictionary contain information about the found text, where it was found (boundingBox), and what kind of text it is (word, line, paragraph).


All SBSDKOpticalTextRecognizer operations run in a separate serial queue.

The operations' completion handlers are called on the main thread.

Example code for performing a page layout analysis:

// The file URL of the image we want to analyse.
NSURL *imageURL = [NSURL URLWithString:@"..."];

// Start the page layout analysis and store the returned SBSDKProgress object. This object can be used to cancel
// the operation or to observe the progress. See NSProgress.
// In completion check if we finished without error and extract the analyzer result from the resultInfo dictionary.
// Now we can work with the result.
SBSDKProgress *progress =
[SBSDKOpticalTextRecognizer analyseImagePageLayout:imageURL
                                        completion:^(BOOL finished, NSError *error, NSDictionary *resultInfo) {
    if (finished && !error) {

        SBSDKPageAnalyzerResult *result = resultInfo[SBSDKResultInfoPageAnalyzerResultsKey];

        if (result.orientation != SBSDKPageOrientationUp) {
            // The image is not oriented upright. We could rotate it here accordingly.
        }
    }
}];


Example code for performing text recognition on an image:

// The file URL of the image we want to analyse.
NSURL *imageURL = [NSURL URLWithString:@"..."];

// Enqueue the text recognition operation.
// We limit detection to the center area of the image, leaving margins of 25% on each side.
// We only want the English language to be recognized.
// The returned SBSDKProgress object can be used to cancel the operation or observe the progress.
// Upon completion we extract the result from the resultInfo dictionary and log the whole recognized text.
// Then we enumerate all words and log them to the console together with their confidence values and bounding boxes.
// Note: The parameter label for the language may differ in your SDK version.
SBSDKProgress *progress =
[SBSDKOpticalTextRecognizer recognizeText:imageURL
                                rectangle:CGRectMake(0.25f, 0.25f, 0.5f, 0.5f)
                           languageString:@"en"
                               completion:^(BOOL finished, NSError *error, NSDictionary *resultInfo) {

    SBSDKOCRResult *result = resultInfo[SBSDKResultInfoOCRResultsKey];
    NSLog(@"Recognized Text: %@", result.recognizedText);
    for (SBSDKOCRPage *page in result.pages) {
        for (SBSDKOCRResultBlock *word in page.words) {
            NSLog(@"Word: %@, Confidence: %0.0f, Box: %@",
                  word.text, word.confidenceValue, NSStringFromCGRect(word.boundingBox));
        }
    }
}];

6. Payform Recognition

The SBSDKPayFormScanner class provides functionality to detect and recognize SEPA payforms. It extracts the relevant information fields, e.g. IBAN, BIC, receiver, amount and reference, by performing optical text recognition on certain areas of the image.

This module needs the German language package; see the Optical Character Recognition section for how to install language files.

For performance reasons the scanner is divided into two parts: detection and recognition. The detection part only tests whether the scanned image contains a payform at all; the recognition part performs the text extraction and fills the fields.

The common usage is to configure the iPhone's camera for full-HD video capturing and run the detection part on each incoming frame synchronously in the video capture queue. When the detector returns a positive result, the recognition part runs on the same full-HD frame in the same video capture queue and returns the recognizer's result.

Example code on how to detect and recognize payforms in the video delegate

- (void)processSampleBuffer:(CMSampleBufferRef)sampleBuffer
           videoOrientation:(AVCaptureVideoOrientation)orientation {

    // Create an SBSDKPayFormScanner.
    // Note: Usually you store it in a property; for demo purposes we create a new one for each frame.
    SBSDKPayFormScanner *scanner = [[SBSDKPayFormScanner alloc] init];

    // Recognize a bank transfer form on the sample buffer.
    SBSDKPayFormRecognitionResult *recognitionResult = [scanner recognizeFromSampleBuffer:sampleBuffer
                                                                              orientation:orientation];

    // If we have successfully recognized a payform...
    if (recognitionResult.recognitionSuccessful) {
        dispatch_async(dispatch_get_main_queue(), ^{
            // ...present the recognition results alert on the main thread.
        });
    }
}

7. Barcode Scanner

The Scanbot SDK provides the ability to scan and extract content from barcodes and QR codes.

The following barcode formats are currently supported:

1D barcodes

  • EAN_13
  • EAN_8
  • UPC_E
  • CODE_39
  • CODE_93
  • CODE_128
  • ITF (Interleaved 2 of 5)

2D barcodes

  • PDF_417

Integration: See our Example App for how to integrate the barcode scanner.

8. Data Scanner

The Scanbot SDK comes with separate scanners for many specific use cases. Use cases that are not covered by any of these specialized scanners can be tackled with the Data Scanner module. This module's main class is SBSDKGenericTextLineRecognizer. You can configure its behavior using the SBSDKGenericTextLineRecognizerConfiguration class.

The Data Scanner recognizes text (OCR) within a user-defined rectangular area of interest on consecutive video frames. A customizable block lets you clean up the raw string by filtering out unwanted characters and OCR noise. Additionally, you can validate the result using pattern matching or another block.

The Data Scanner returns an SBSDKGenericTextLineRecognizerResult object when it has recognized text. This result contains the cleaned-up string as well as a boolean flag that tells you whether the validation was successful.

Use cases for the Data Scanner module are the recognition of single-line text such as IBANs, insurance numbers, dates, and other textual data fields that can easily be validated or pattern-matched.

How is the Data Scanner different from regular OCR? In short, it is more reliable and robust, with higher confidence in text recognition, because it accumulates the results of multiple video frames as well as your input from the raw-text clean-up block.

To make the integration painless for you, Scanbot SDK provides an easy-to-use plugin view controller named SBSDKGenericTextLineRecognizerViewController that takes over the camera handling, displays the area of interest and runs the Data Scanner. The scanner's results are passed to a delegate.
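As an illustration of the clean-up and validation blocks, a rough sketch could look like this. Note that the exact property names used below (stringSanitizerBlock, validationBlock) and the defaultConfiguration initializer are assumptions for illustration and may differ from the actual SDK headers:

```objc
// Illustrative sketch only: the property and initializer names here are
// assumptions, not confirmed SDK API. Check SBSDKGenericTextLineRecognizerConfiguration.h.
SBSDKGenericTextLineRecognizerConfiguration *configuration =
    [SBSDKGenericTextLineRecognizerConfiguration defaultConfiguration];

// Clean up the raw OCR string, e.g. strip surrounding whitespace and newlines.
configuration.stringSanitizerBlock = ^NSString *(NSString *rawText) {
    return [rawText stringByTrimmingCharactersInSet:
            [NSCharacterSet whitespaceAndNewlineCharacterSet]];
};

// Validate the cleaned-up string, e.g. against a crude German-IBAN-like check.
configuration.validationBlock = ^BOOL(NSString *text) {
    return text.length >= 15 && [text hasPrefix:@"DE"];
};
```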

9. Image Storage

The Scanbot SDK comes with built-in image storage. You can store scanned images using either keyed or indexed storage; we provide convenient interfaces for this via the SBSDKKeyedImageStorage and SBSDKIndexedImageStorage classes.

SBSDKKeyedImageStorage is a simple, thread-safe, multiple-reader/single-writer, key-value based disk image cache that manages images in a dictionary-like fashion.

SBSDKIndexedImageStorage is a simple, thread-safe, multiple-reader/single-writer, index-based disk image cache that manages images in an array-like fashion.

Both classes support the PNG and JPEG image file formats, represented by the SBSDKImageFileFormat enumeration.

Both classes support encryption, so your data is stored securely. We provide built-in support for AES128 and AES256 encryption. If these algorithms do not meet your requirements, you can create your own encrypter by implementing a class that conforms to the SBSDKStorageCrypting protocol.
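As a sketch of what a custom encrypter could look like — the SBSDKStorageCrypting method names used below are assumptions for illustration and must be checked against the actual protocol header:

```objc
// Illustrative sketch of a custom encrypter. The protocol method names
// (encryptData:/decryptData:) are assumptions; check SBSDKStorageCrypting.h.
@interface IdentityEncrypter : NSObject <SBSDKStorageCrypting>
@end

@implementation IdentityEncrypter

// A no-op "encrypter" that stores data as-is; replace with a real cipher.
- (NSData *)encryptData:(NSData *)data {
    return data;
}

- (NSData *)decryptData:(NSData *)data {
    return data;
}

@end
```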

For easier access to the device's file system, Scanbot SDK provides the convenient helper class SBSDKStorageLocation.

Example of creating an indexed image storage using SBSDKStorageLocation and the built-in AES256 encrypter:

// Create the URL of the directory where you plan to store images.
// Note: The subfolder name "Images" is just an example.
NSURL *imagesDirectoryURL = [[SBSDKStorageLocation applicationDocumentsFolderURL] URLByAppendingPathComponent:@"Images"];

// Create a storage location using the URL of the images directory.
SBSDKStorageLocation *storageLocation = [[SBSDKStorageLocation alloc] initWithBaseURL:imagesDirectoryURL];

// Create an encrypter with a key and the desired encryption mode.
SBSDKAESEncrypter *encrypter = [[SBSDKAESEncrypter alloc] initWithKey:@"password_example#42"
                                                                 mode:SBSDKAESEncrypterModeAES256];

// Create the indexed image storage using the storage location, image file format and encrypter.
SBSDKIndexedImageStorage *imageStorage = [[SBSDKIndexedImageStorage alloc] initWithStorageLocation:storageLocation
                                                                                        fileFormat:SBSDKImageFileFormatJPEG
                                                                                         encrypter:encrypter];

Examples of some basic operations on Indexed Image storage:

// In this example the image storage is used as a stored property.
// Note: How to initialize an indexed image storage is described in the example above.
@property (strong, nonatomic) SBSDKIndexedImageStorage *imageStorage;

// Create an image.
UIImage *image = [UIImage imageNamed:@"testImage"];

// Create an index.
NSInteger index = 0;

// Get the image at the index.
UIImage *storedImage = [self.imageStorage imageAtIndex:index];

// Add an image to the storage.
BOOL isAdded = [self.imageStorage addImage:image];

// Insert an image at the index.
BOOL isInserted = [self.imageStorage insertImage:image atIndex:index];

// Remove the image at the index.
[self.imageStorage removeImageAtIndex:index];

// Move an image from one index to another.

// Create the new index to move the image to.
NSInteger newIndex = 1;

BOOL isMoved = [self.imageStorage moveImageFromIndex:index toIndex:newIndex];

// Replace the image at an index with another image.

// Create an image to replace the stored one.
UIImage *newImage = [UIImage imageNamed:@"newTestImage"];

BOOL isReplaced = [self.imageStorage replaceImageAtIndex:index withImage:newImage];

Examples of some basic operations on Keyed Image storage:

// In this example the image storage is used as a stored property.
// Note: Initialization of a keyed image storage works analogously to the indexed storage example above.
@property (strong, nonatomic) SBSDKKeyedImageStorage *imageStorage;

// Create an image.
UIImage *image = [UIImage imageNamed:@"testImage"];

// Create a key.
NSString *key = @"testKey";

// Set the image for the key.
[self.imageStorage setImage:image forKey:key];

// Get the image for the key.
UIImage *storedImage = [self.imageStorage imageForKey:key];

// Remove the image for the key.
[self.imageStorage removeImageForKey:key];

// Remove all images whose keys match a prefix.

// Create the prefix.
NSString *prefix = @"test";

[self.imageStorage removeImagesForKeysMatchingPrefix:prefix];