Modules in Detail

1. Document Detection

SBSDKDocumentDetector uses digital image processing algorithms to find rectangular, document-like polygons in a digital image.

- (SBSDKDocumentDetectorResult *)detectDocumentPolygonOnImage:(UIImage *)image
    visibleImageRect:(CGRect)visibleRect
    smoothingEnabled:(BOOL)smooth
    useLiveDetectionParameters:(BOOL)liveDetection;

The detector accepts either a UIImage object or a CMSampleBufferRef as input. The camera API in Apple's AVFoundation framework typically delivers CMSampleBufferRef objects, so you can detect document polygons on a live camera video stream or on a still image shot. Images from the photo library are usually converted to UIImage objects before being passed to the detector.

The visibleRect parameter lets you limit the detection area on the image. If you pass CGRectZero or a rectangle with zero width or height, the whole image is used for detection. The rectangle must be given in the unit coordinate system { 0.0f, 0.0f } - { 1.0f, 1.0f }, where { 0.0f, 0.0f } is the top left and { 1.0f, 1.0f } the bottom right corner of the image. The detector ignores edges that lie outside the visibleRect.
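If your region of interest is known in image pixels, it has to be converted into unit coordinates before it is passed as visibleRect. The following is a minimal sketch of such a conversion; UnitRectFromPixelRect is a hypothetical helper, not part of Scanbot SDK, and it assumes that pixelRect and imageSize use the same pixel-based coordinate space with the origin at the top left.

```objc
#import <Foundation/Foundation.h>
#import <CoreGraphics/CoreGraphics.h>

// Hypothetical helper (not SDK API): converts a rectangle given in image
// pixels into the unit coordinate system { 0, 0 } - { 1, 1 }.
// The rectangle is clipped to the image bounds first.
static CGRect UnitRectFromPixelRect(CGRect pixelRect, CGSize imageSize) {
    if (imageSize.width <= 0.0 || imageSize.height <= 0.0) {
        // Nothing sensible to compute; CGRectZero means "use the whole image".
        return CGRectZero;
    }
    CGRect imageBounds = CGRectMake(0.0, 0.0, imageSize.width, imageSize.height);
    CGRect clipped = CGRectIntersection(pixelRect, imageBounds);
    if (CGRectIsEmpty(clipped)) {
        // No overlap with the image (CGRectIsEmpty is also true for CGRectNull).
        return CGRectZero;
    }
    return CGRectMake(clipped.origin.x / imageSize.width,
                      clipped.origin.y / imageSize.height,
                      clipped.size.width / imageSize.width,
                      clipped.size.height / imageSize.height);
}
```

For a 1080 x 1920 pixel image, UnitRectFromPixelRect(CGRectMake(0, 480, 1080, 960), CGSizeMake(1080, 1920)) yields the unit rectangle { 0.0, 0.25, 1.0, 0.5 }, i.e. the middle half of the image. Note that the helper falls back to CGRectZero when the input does not overlap the image, which the detector treats as "whole image"; handle that case differently if it does not suit your use.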

The smooth parameter is typically used for realtime detection, where the same detector instance is reused for every frame. When set to YES, consecutive document polygons detected within a certain timeframe are accumulated into a single dampened polygon. This prevents jumping polygons in situations where the recognized edges change from video frame to video frame. If you use smoothing you should also observe the device's motion and clear the consecutive polygon buffer when significant motion is detected. This is done by calling the detector's -resetSmoothingData method.
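One way to observe device motion is Apple's Core Motion framework. The sketch below resets the smoothing buffer whenever the rotation rate exceeds a threshold. MotionResetter and the threshold value are assumptions made for illustration, and the sketch assumes that -resetSmoothingData takes no arguments and may be called from the main queue; if your detection runs on another queue, dispatching the reset onto that queue is the safer choice.

```objc
#import <Foundation/Foundation.h>
#import <CoreMotion/CoreMotion.h>
#import "SBSDKScanbotSDK.h"

// Illustrative threshold in radians per second; tune it for your app.
static const double kRotationRateThreshold = 1.0;

// Hypothetical helper class, not part of Scanbot SDK.
@interface MotionResetter : NSObject
- (instancetype)initWithDetector:(SBSDKDocumentDetector *)detector;
- (void)start;
- (void)stop;
@end

@implementation MotionResetter {
    CMMotionManager *_motionManager;
    SBSDKDocumentDetector *_detector;
}

- (instancetype)initWithDetector:(SBSDKDocumentDetector *)detector {
    self = [super init];
    if (self) {
        _detector = detector;
        _motionManager = [[CMMotionManager alloc] init];
        _motionManager.deviceMotionUpdateInterval = 0.1; // 10 updates per second
    }
    return self;
}

- (void)start {
    if (!_motionManager.isDeviceMotionAvailable) {
        return;
    }
    SBSDKDocumentDetector *detector = _detector;
    [_motionManager startDeviceMotionUpdatesToQueue:[NSOperationQueue mainQueue]
                                        withHandler:^(CMDeviceMotion *motion, NSError *error) {
        if (motion == nil) {
            return;
        }
        // Magnitude of the rotation rate around all three axes.
        CMRotationRate rate = motion.rotationRate;
        double magnitude = sqrt(rate.x * rate.x + rate.y * rate.y + rate.z * rate.z);
        if (magnitude > kRotationRateThreshold) {
            // Significant motion: drop the accumulated polygons.
            [detector resetSmoothingData];
        }
    }];
}

- (void)stop {
    [_motionManager stopDeviceMotionUpdates];
}

@end
```

Create one MotionResetter for the detector instance you keep in a property, call -start when the camera becomes visible and -stop when it disappears.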

Set the liveDetection flag to YES when you need the fastest possible detection, e.g. for realtime detection. The results are slightly less accurate than with NO, but detection is significantly faster. With liveDetection enabled an iPhone 6 can process up to 20 video frames per second, depending on the video frame resolution. It is recommended to turn liveDetection off if you want to detect only once on a static image.

The result contains an SBSDKPolygon (or nil if nothing was found) and an SBSDKDocumentDetectionStatus enum member. The status can be used for user guidance during live detection. It tells you whether the detected polygon is too small, the perspective is insufficient or the orientation is not optimal. If no polygon was detected, the status may tell you why (for example, the background is too noisy or the image is too dark).

For manual realtime detection with the device’s camera also take a look at the following classes:

Example code for document detection on a video frame

- (void)captureOutput:(AVCaptureOutput *)captureOutput
    didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
    fromConnection:(AVCaptureConnection *)connection {

    // Create an SBSDKDocumentDetector.
    // Note: Usually you store it in a property; for demo purposes we create a new one for each frame.
    SBSDKDocumentDetector *detector = [[SBSDKDocumentDetector alloc] init];

    // Detect a document's outline on the sample buffer.
    SBSDKDocumentDetectorResult *result = [detector detectDocumentPolygonOnSampleBuffer:sampleBuffer
        visibleImageRect:CGRectZero
        smoothingEnabled:YES
        useLiveDetectionParameters:YES];

    // If we have an acceptable polygon...
    if (result.status == SBSDKDocumentDetectionStatusOK && result.polygon != nil) {
        // We take a still shot from the camera.
        // Create a UIImage from the still shot sample buffer.
        // Warp the image into the polygon.
        // The block parameter is named stillBuffer so that it does not shadow the video frame's sampleBuffer.
        [self.cameraSession captureStillImageWithCompletionHandler:^(CMSampleBufferRef stillBuffer, NSError *error) {
            if (stillBuffer == NULL || error != nil) {
                return;
            }
            UIImage *image = [UIImage sbsdk_imageFromSampleBuffer:stillBuffer];
            image = [image sbsdk_imageWarpedByPolygon:result.polygon imageScale:1.0];
        }];
    }
}

2. User Interface for guided, automatic Document Scanning

See SBSDKScannerViewController.

For your convenience Scanbot SDK comes with a view controller subclass that handles all the camera and detection implementation details for you. Additionally it provides the UI for Scanbot's document scanning guidance as well as the functionality and UI for manual and automatic shutter release.

The controller's delegate can customize the appearance and behavior of the guidance UI. Furthermore, SBSDKScannerViewController gives its delegate control over how and when frames are analyzed and, most importantly, it delivers the scanned (perspective corrected and cropped) document images to its delegate.

See SBSDKScannerViewControllerDelegate for customization of UI and behavior.

Example code

#import "SBSDKScanbotSDK.h"

@interface DemoViewController : UIViewController
@end

@interface DemoViewController() <SBSDKScannerViewControllerDelegate>
@property (strong, nonatomic) SBSDKImageStorage *imageStorage;
@property (strong, nonatomic) SBSDKScannerViewController *scannerViewController;
@property (assign, nonatomic) BOOL viewAppeared;
@end

@implementation DemoViewController

- (void)viewDidLoad {
    [super viewDidLoad];

    // Create an image storage to save the captured document images to.
    self.imageStorage = [[SBSDKImageStorage alloc] init];

    // Create the SBSDKScannerViewController.
    // We want it to be embedded into self.
    // As we do not want automatic image storage we pass nil here for the image storage.
    self.scannerViewController
        = [[SBSDKScannerViewController alloc] initWithParentViewController:self imageStorage:nil];

    // Set the delegate to self.
    self.scannerViewController.delegate = self;

    // We want unscaled images in full 5 or 8 MPixel size.
    self.scannerViewController.imageScale = 1.0f;
}

- (void)viewWillDisappear:(BOOL)animated {
    [super viewWillDisappear:animated];
    self.viewAppeared = NO;
}

- (void)viewDidAppear:(BOOL)animated {
    [super viewDidAppear:animated];
    self.viewAppeared = YES;
}

- (BOOL)shouldAutorotate {
    // No autorotations.
    return NO;
}

- (NSUInteger)supportedInterfaceOrientations {
    // Only portrait.
    return UIInterfaceOrientationMaskPortrait;
}

- (UIStatusBarStyle)preferredStatusBarStyle {
    // White status bar.
    return UIStatusBarStyleLightContent;
}

#pragma mark - SBSDKScannerViewControllerDelegate

- (BOOL)scannerControllerShouldAnalyseVideoFrame:(SBSDKScannerViewController *)controller {
    // We only want to process video frames while self is visible on screen and is the frontmost view controller.
    return self.viewAppeared && self.presentedViewController == nil;
}

- (void)scannerController:(SBSDKScannerViewController *)controller
    didCaptureDocumentImage:(UIImage *)documentImage {
    // Here we get the perspective corrected and cropped document image after the shutter was (auto)released.
    // We store it into our image storage.
    [self.imageStorage addImage:documentImage];
}

- (void)scannerController:(SBSDKScannerViewController *)controller didCaptureImage:(CMSampleBufferRef)sampleBuffer {
    // Here we get the full image from the camera. We could run another manual detection here or use the latest
    // detected polygon from the video stream to process the image with.
}

- (void)scannerController:(SBSDKScannerViewController *)controller
    didDetectPolygon:(SBSDKPolygon *)polygon
    withStatus:(SBSDKDocumentDetectionStatus)status {
    // Every time the document detector finishes a detection it calls this delegate method.

}

- (UIView *)scannerController:(SBSDKScannerViewController *)controller
    viewForDetectionStatus:(SBSDKDocumentDetectionStatus)status {

    // Here we can return a custom view that we want to use to visualize the latest detection status.
    // We return nil for now to use the standard label.
    return nil;
}

- (UIColor *)scannerController:(SBSDKScannerViewController *)controller
    polygonColorForDetectionStatus:(SBSDKDocumentDetectionStatus)status {

    // If the detector has found an acceptable polygon we show it in green.
    if (status == SBSDKDocumentDetectionStatusOK) {
        return [UIColor greenColor];
    }
    // Otherwise we show it in red.
    return [UIColor redColor];
}

@end

See SBSDKScannerViewControllerDelegate for details.

3. Image Processing

See SBSDKImageProcessor.

Digital image processing is a core part of Scanbot SDK. It offers three basic operations on images:

  • Rotation
  • Image filtering
  • Image warping (perspective correction and cropping) into the shape of a 4-sided polygon

All of these image operations can be called either synchronously on any thread or queue, or asynchronously on a dedicated serial image processing queue. When working with large images it is highly recommended to use the asynchronous API, because the serial queue processes only one image at a time. Processing large images concurrently easily causes memory warnings and crashes.

The synchronous API can be found in the UIImageSBSDK class extension.

The asynchronous API is implemented in the static class SBSDKImageProcessor. In addition to the three standard operations, SBSDKImageProcessor provides a method to apply custom image processing by specifying an SBSDKImageProcessingHandler block. Its execution is also dispatched to the image processing queue. The operations' completion handlers are called on the main thread.

Each call into the asynchronous API returns an SBSDKProgress object. This NSProgress subclass can be used to observe the progress of the operation and to cancel the operation via its -(void)cancel method.
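Because the returned object is an NSProgress subclass, the standard NSProgress mechanisms apply to it. The following sketch observes fractionCompleted via key-value observing and offers a cancel method. ProgressWatcher is a hypothetical helper class, not part of Scanbot SDK; it only uses NSProgress API from Foundation. How often an SBSDKProgress updates its fraction is not stated here, so treat the logging as illustrative.

```objc
#import <Foundation/Foundation.h>

static void *ProgressWatcherContext = &ProgressWatcherContext;

// Hypothetical helper class: watches a single NSProgress at a time.
@interface ProgressWatcher : NSObject
@property (nonatomic, strong) NSProgress *progress;
- (void)watch:(NSProgress *)progress;
- (void)stopWatching;
- (void)cancel;
@end

@implementation ProgressWatcher

- (void)watch:(NSProgress *)progress {
    [self stopWatching];
    self.progress = progress;
    [progress addObserver:self
               forKeyPath:@"fractionCompleted"
                  options:NSKeyValueObservingOptionNew
                  context:ProgressWatcherContext];
}

- (void)stopWatching {
    // Messaging nil is a no-op, so this is safe when nothing is being watched.
    [self.progress removeObserver:self
                       forKeyPath:@"fractionCompleted"
                          context:ProgressWatcherContext];
    self.progress = nil;
}

- (void)observeValueForKeyPath:(NSString *)keyPath
                      ofObject:(id)object
                        change:(NSDictionary<NSKeyValueChangeKey, id> *)change
                       context:(void *)context {
    if (context != ProgressWatcherContext) {
        [super observeValueForKeyPath:keyPath ofObject:object change:change context:context];
        return;
    }
    // KVO notifications of NSProgress may arrive on any thread;
    // dispatch to the main queue before touching UI.
    NSProgress *progress = object;
    NSLog(@"Progress: %.0f%%", progress.fractionCompleted * 100.0);
}

- (void)cancel {
    [self.progress cancel];
}

- (void)dealloc {
    [self stopWatching];
}

@end
```

Pass the SBSDKProgress returned by an asynchronous call to -watch:, and call -cancel, for example, when the user leaves the screen before the operation has finished.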

Example code for custom asynchronous image filter

// Specify the file URL for the input image.
NSURL *inputImageURL = ...;

// Specify the file URL the output image is written to. Set to nil if you don't want to save the output image.
NSURL *outputImageURL = ...;

// Create the image processing block.
SBSDKImageProcessingHandler processingHandler = ^UIImage *(UIImage *sourceImage, NSError **outError) {
    // Apply a color filter to the input image,
    UIImage *filteredImage = [sourceImage imageFilteredBy:SBSDKImageFilterTypeColor];

    // and return the filtered image.
    return filteredImage;
};

// Call the asynchronous image processing API. The returned progress object can be used to cancel the operation.
// Once the operation has completed, extract the image from the resultInfo dictionary and do whatever you want with it.
SBSDKProgress *progress
    = [SBSDKImageProcessor customFilterImage:inputImageURL
        processingBlock:processingHandler
        outputImageURL:outputImageURL
        completion:^(BOOL finished, NSError *error, NSDictionary *resultInfo) {
            if (finished && !error) {
                UIImage *outputImage = resultInfo[SBSDKResultInfoDestinationImageKey];
            }
        }];

Example code for detecting and applying a polygon to an image

// Specify the input image URL.
NSURL *inputImageURL = ...;

// Specify the output image URL. Set to nil if you don't want to save the output image.
NSURL *outputImageURL = ...;

// Create a document detector.
SBSDKDocumentDetector *detector = [[SBSDKDocumentDetector alloc] init];

// Let the document detector run on our input image.
SBSDKDocumentDetectorResult *result
    = [detector detectDocumentPolygonOnImage:[UIImage imageWithContentsOfFile:inputImageURL.path]
        visibleImageRect:CGRectZero
        smoothingEnabled:NO
        useLiveDetectionParameters:NO];

// Check the result.
if (result.status == SBSDKDocumentDetectionStatusOK && result.polygon != nil) {

    // If the result is an acceptable polygon, we warp the image into the polygon asynchronously.
    // When warping is done we check the result and on success we pick up the output image.
    // Then do whatever you want with the warped image.
    [SBSDKImageProcessor warpImage:inputImageURL
        polygon:result.polygon
        outputImageURL:outputImageURL
        completion:^(BOOL finished, NSError *error, NSDictionary *resultInfo) {
            if (finished && !error) {
                UIImage *outputImage = resultInfo[SBSDKResultInfoDestinationImageKey];
            }
        }];
} else {
    // No acceptable polygon found.
}

4. PDF Creation

The static class SBSDKPDFRenderer takes an image storage (SBSDKImageStorage) and renders its images into a PDF. One page is generated for each image. The generated pages have sizes that correspond to DIN A4, US Letter or Custom. As the images are embedded unscaled, the resolution of each page depends on its image. When rendering into DIN A4 or US Letter format, the orientation of the page (landscape or portrait) is derived from the image's aspect ratio.
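To make the orientation rule concrete, the sketch below picks a landscape or portrait page for a given image size. It illustrates the idea only and is not the renderer's actual implementation: PageSizeForImage and the two constants are hypothetical names, and the assumption that exactly square images map to portrait is ours. The page sizes are the standard dimensions in PDF points (1/72 inch).

```objc
#import <Foundation/Foundation.h>
#import <CoreGraphics/CoreGraphics.h>

// Portrait page sizes in PDF points (1/72 inch).
static const CGSize kA4PortraitSize = {595.0, 842.0};       // 210 x 297 mm, rounded
static const CGSize kUSLetterPortraitSize = {612.0, 792.0}; // 8.5 x 11 inch

// Hypothetical helper (not SDK API): returns the page size in the
// orientation that matches the image's aspect ratio.
static CGSize PageSizeForImage(CGSize imageSize, CGSize portraitPageSize) {
    BOOL landscape = imageSize.width > imageSize.height;
    if (landscape) {
        // Swap width and height to get the landscape variant of the page.
        return CGSizeMake(portraitPageSize.height, portraitPageSize.width);
    }
    return portraitPageSize;
}
```

A 4000 x 3000 pixel image is wider than it is tall, so PageSizeForImage(CGSizeMake(4000, 3000), kUSLetterPortraitSize) returns a 792 x 612 point landscape page, while a 3000 x 4000 pixel image keeps the 612 x 792 point portrait page.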

See SBSDKPDFRendererPageSize for further information.

The operation's completion handler is called on the main thread.

Example code for creating a standard PDF from an image storage

// Create or use an existing image storage.
SBSDKImageStorage *imageStorage = [SBSDKImageStorage temporaryStorageWithImages:...];

// Define the indices of the images in the image storage you want to render to the PDF, e.g. the first 3.
// To include all images you can simply pass nil for the indexSet. The indexSet is validated internally,
// so you don't need to make sure that all indices are valid.
NSIndexSet *indexSet = [NSIndexSet indexSetWithIndexesInRange:NSMakeRange(0, 3)];

// Specify the file URL where the PDF will be saved to. Nil makes no sense here.
NSURL *outputPDFURL = ...;

// Enqueue the operation and store the SBSDKProgress to watch the progress or cancel the operation.
// After completion the PDF is stored at the URL specified in outputPDFURL.
// You can also extract the image storage and the PDF URL from the resultInfo.

SBSDKProgress *progress =
    [SBSDKPDFRenderer renderImageStorage:imageStorage
        copyImageStorage:YES
        indexSet:indexSet
        withPageSize:SBSDKPDFRendererPageSizeAutoLocale
        output:outputPDFURL
        completionHandler:^(BOOL finished, NSError *error, NSDictionary *resultInfo) {
            // Only read the results if the operation finished without an error.
            if (finished && error == nil) {
                SBSDKImageStorage *completedImageStore = resultInfo[SBSDKResultInfoImageStorageKey];
                NSURL *completedPDFURL = resultInfo[SBSDKResultInfoDestinationFileURLKey];
            }
        }];

5. Optical Character Recognition

The Scanbot OCR feature is based on the Tesseract OCR engine with some modifications and enhancements. Scanbot SDK uses an optimized custom Tesseract library under the hood and provides a convenient API for it.

For each desired OCR language a corresponding .traineddata file (also known as tessdata) must be installed in the optional resource bundle named ScanbotSDKOCRData.bundle. The special data file osd.traineddata is also required and must be installed. It is used for orientation and script detection.

To keep the framework small, ScanbotSDK.framework itself does not contain any OCR language files. The optional bundle ScanbotSDKOCRData.bundle, provided in the ZIP archive of Scanbot SDK, contains the language files for English and German as well as osd.traineddata as examples. You can replace or supplement these language files as needed. Add this bundle to your project and make sure that it is copied along with your resources into your app.
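A missing data file is easy to overlook until OCR fails at runtime, so it can help to check the bundle contents at startup. The sketch below lists the required .traineddata files that are not present. MissingOCRDataFiles is a hypothetical helper, not part of Scanbot SDK. It assumes the Tesseract naming convention of language code plus .traineddata (for example eng.traineddata and deu.traineddata), and since the directory layout inside the bundle is not described here, it searches the bundle recursively.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not SDK API): returns the sorted names of the required
// .traineddata files that cannot be found anywhere below bundlePath.
static NSArray<NSString *> *MissingOCRDataFiles(NSString *bundlePath,
                                                NSArray<NSString *> *languages) {
    // osd.traineddata is always required, plus one file per language.
    NSMutableSet<NSString *> *required = [NSMutableSet setWithObject:@"osd.traineddata"];
    for (NSString *language in languages) {
        [required addObject:[language stringByAppendingString:@".traineddata"]];
    }

    if (bundlePath != nil) {
        // Walk the bundle recursively and tick off every file we find.
        NSDirectoryEnumerator<NSString *> *enumerator =
            [[NSFileManager defaultManager] enumeratorAtPath:bundlePath];
        for (NSString *relativePath in enumerator) {
            [required removeObject:relativePath.lastPathComponent];
        }
    }
    return [required.allObjects sortedArrayUsingSelector:@selector(compare:)];
}

// Usage: check for the English and German files in the app's main bundle.
static void LogMissingOCRData(void) {
    NSString *bundlePath = [[NSBundle mainBundle] pathForResource:@"ScanbotSDKOCRData"
                                                           ofType:@"bundle"];
    NSArray<NSString *> *missing = MissingOCRDataFiles(bundlePath, @[@"eng", @"deu"]);
    if (missing.count > 0) {
        NSLog(@"Missing OCR data files: %@", missing);
    }
}
```

If the bundle itself was not copied into the app, bundlePath is nil and every required file is reported as missing.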

Download OCR files

You can find a list of all supported OCR languages and download links on this Tesseract wiki page.

⚠️️️ Please choose and download the proper version of the language data files:

OCR API

SBSDKOpticalTextRecognizer takes one or more images and performs various text-related operations on each of them:

  • Page layout analysis
  • Text recognition
  • Creation of searchable PDF documents with selectable text

The page layout analysis returns information about the page orientation, the angle by which the image should be rotated to deskew it, the text writing direction and the text line order.

The text recognition operations take either a collection of images (SBSDKImageStorage), optionally creating a PDF from it, or a single image. The single image operation can also take a rectangle describing which area of the image should be text-recognized. The results found in the completion handler's resultInfo dictionary contain information about the found text, where the text was found (boundingBox) and what kind of text it is (word, line, paragraph).

Examples

All SBSDKOpticalTextRecognizer operations run on a separate serial queue.

The operations' completion handlers are called on the main thread.

Example code for performing a page layout analysis:

// The file URL of the image we want to analyse.
NSURL *imageURL = ...;

// Start the page layout analysis and store the returned SBSDKProgress object. This object can be used to cancel
// the operation or to observe the progress. See NSProgress.
// In the completion handler check whether we finished without error and extract the analyzer result
// from the resultInfo dictionary. Now we can work with the result.
SBSDKProgress *progress =
    [SBSDKOpticalTextRecognizer analyseImagePageLayout:imageURL
        completion:^(BOOL finished, NSError *error, NSDictionary *resultInfo) {
            if (finished && !error) {

                SBSDKPageAnalyzerResult *result = resultInfo[SBSDKResultInfoPageAnalyzerResultsKey];

                if (result.orientation != SBSDKPageOrientationUp) {
                    // The page is not upright. React to it here, e.g. by rotating the image.
                }
            }
        }];

Example code for performing text recognition on an image:

// The file URL of the image we want to analyse.
NSURL *imageURL = ...;

// Enqueue the text recognition operation.
// We limit detection to the center area of the image, leaving margins of 25% on each side.
// Only the English language is recognized.
// The returned SBSDKProgress object can be used to cancel the operation or to observe the progress.
// Upon completion we extract the result from the resultInfo dictionary and log the whole recognized text.
// Then we enumerate all words and log them to the console together with their confidence values and bounding boxes.
SBSDKProgress *progress =
    [SBSDKOpticalTextRecognizer recognizeText:imageURL
        rectangle:CGRectMake(0.25f, 0.25f, 0.5f, 0.5f)
        languageString:@"eng"
        completion:^(BOOL finished, NSError *error, NSDictionary *resultInfo) {
            // Only read the results if the operation finished without an error.
            if (!finished || error) {
                return;
            }

            SBSDKOCRResult *result = resultInfo[SBSDKResultInfoOCRResultsKey];
            NSLog(@"Recognized Text: %@", result.recognizedText);
            for (SBSDKOCRResultBlock *word in result.words) {
                NSLog(@"Word: %@, Confidence: %0.0f, Box: %@",
                      word.text,
                      word.confidenceValue,
                      NSStringFromCGRect(word.boundingBox));
            }
        }];

6. Payform Recognition

The SBSDKPayFormScanner class provides functionality to detect and recognize SEPA payforms. It extracts relevant information fields, e.g. IBAN, BIC, receiver, amount of money and reference, by performing optical text recognition on certain areas of the image.

This module needs the German language package. See SBSDKOpticalTextRecognizer for how to add languages.

For performance reasons the scanner is divided into two parts: detection and recognition. The detection only tests whether the scanned image contains a payform or not. The recognizer performs the text extraction and fills the fields.

The common usage is to configure the iPhone's camera for full HD video capturing and to run the detection part on each incoming frame synchronously on the video capture queue. When the detector returns a positive result, the recognition part runs on the same full HD frame on the same video capture queue and returns the recognizer's result.
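If you manage the camera yourself with plain AVFoundation rather than through the SDK's own camera session, a full HD setup could look like the sketch below. MakeFullHDCaptureSession is a hypothetical helper, not part of Scanbot SDK; it only uses AVFoundation API. It assumes camera permission has already been granted and falls back to the session's default preset when the device does not support 1920x1080.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper (not SDK API): creates a capture session that delivers
// full HD video frames to the delegate on the given serial queue.
// Returns nil if the camera cannot be set up.
static AVCaptureSession *MakeFullHDCaptureSession(
        id<AVCaptureVideoDataOutputSampleBufferDelegate> delegate,
        dispatch_queue_t videoQueue,
        NSError **outError) {

    AVCaptureDevice *camera = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
    if (camera == nil) {
        return nil; // No camera available, e.g. in the simulator.
    }
    AVCaptureDeviceInput *input = [AVCaptureDeviceInput deviceInputWithDevice:camera
                                                                        error:outError];
    if (input == nil) {
        return nil;
    }

    AVCaptureVideoDataOutput *output = [[AVCaptureVideoDataOutput alloc] init];
    // Drop frames that arrive while the previous one is still being processed.
    output.alwaysDiscardsLateVideoFrames = YES;
    [output setSampleBufferDelegate:delegate queue:videoQueue];

    AVCaptureSession *session = [[AVCaptureSession alloc] init];
    BOOL configured = NO;
    [session beginConfiguration];
    if ([session canAddInput:input]) {
        [session addInput:input];
        // Request full HD if the device supports it.
        if ([session canSetSessionPreset:AVCaptureSessionPreset1920x1080]) {
            session.sessionPreset = AVCaptureSessionPreset1920x1080;
        }
        if ([session canAddOutput:output]) {
            [session addOutput:output];
            configured = YES;
        }
    }
    [session commitConfiguration];

    return configured ? session : nil;
}
```

Create the queue with dispatch_queue_create("video", DISPATCH_QUEUE_SERIAL), keep a strong reference to the returned session and call -startRunning on it. The delegate's captureOutput:didOutputSampleBuffer:fromConnection: method is then invoked on that queue for every frame.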

Example code on how to detect and recognize payforms in the video delegate

- (void)captureOutput:(AVCaptureOutput *)captureOutput
    didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
    fromConnection:(AVCaptureConnection *)connection {

    // Create an SBSDKPayFormScanner.
    // Note: Usually you store it in a property; for demo purposes we create a new one for each frame.
    SBSDKPayFormScanner *scanner = [[SBSDKPayFormScanner alloc] init];

    // Get the video orientation from the camera session (of the SBSDKScannerViewController).
    AVCaptureVideoOrientation videoOrientation = self.scannerViewController.cameraSession.videoOrientation;

    // Detect a bank transfer form in the sample buffer.
    SBSDKPayFormDetectionResult *detectionResult = [scanner detectInSampleBuffer:sampleBuffer
        orientation:videoOrientation];

    // If we have detected a valid payform
    if (detectionResult.isValidPayForm) {

        // We perform the recognition step.
        SBSDKPayFormRecognitionResult *recognitionResult = [scanner recognizeFieldsInSampleBuffer:sampleBuffer
            orientation:videoOrientation];

        dispatch_async(dispatch_get_main_queue(), ^{
            // Present the recognition results alert on the main thread.
        });
    }
}

7. Barcode Scanner

Scanbot SDK provides the ability to scan and extract content from barcodes and QR codes.

The following barcode formats are currently supported:

1D barcodes

  • EAN_13
  • EAN_8
  • UPC_E
  • CODE_39
  • CODE_93
  • CODE_128
  • ITF (Interleaved 2 of 5)

2D barcodes

  • QR_CODE
  • DATA_MATRIX
  • AZTEC
  • PDF_417

Integration: See our Example App for how to integrate the barcode scanner.