Building the Business Case for Voice Technology

Tom Singer
Tompkins Associates

February 2010
www.tompkinsinc.com

© 2010 Tompkins Associates Sponsored by Vocollect, Inc.

Introduction

Voice technology has great appeal for users of SAP who want to extend their logistics and fulfillment processes. It offers the promise of hands-free, eyes-free wireless access to the information needed to drive key warehouse processes and has become an important ingredient in the success of a company’s IT strategy.

Before Voice, no other technology had a greater impact on the evolution of warehouse management systems (WMSs) than the wireless local area network (LAN), and mobile or Radio Frequency (RF) terminal. While they are popular with many organizations, RF terminals and barcode scanning do have some drawbacks. They require operators to use their hands to scan and key data. They also require operators to read instructions on terminal displays. For many operations, these activities disrupt the normal flow of warehousing and limit the benefits provided by the technology.

Despite these drawbacks, LANs and RF have provided the opportunity for many distribution operations to substantially increase accuracy, productivity, visibility, and control of core warehousing functions. In many ways, the integration of RF communications and barcode scanning into WMS solutions in the 1990s put information and data collection into the hands of warehouse floor personnel.

Like traditional RF-based barcode scanning terminals, Voice solutions center on a small, wireless mobile computing device that warehouse associates typically wear on a belt. The difference is that Voice delivers application instructions verbally through a headset and captures worker responses through a microphone – with no stopping to look at a screen, key in a quantity, or scan a barcode.

Voice Becomes Mainstream

Voice is not new to SAP users. But until recently, it has been more of a niche application than a mainstream solution offering. This has changed as the technology has matured and evolved and as SAP users have focused more on improving their logistics and fulfillment processes. Voice technology now plays a major role within the warehouse and distribution center for users of SAP and non-SAP systems.

The key drivers in this movement are:

  • Proliferation of Wireless LANs - Warehousing has been at the forefront of the development of wireless LANs since the late 1980s. But early deployments of this technology were costly, custom propositions. The evolution of 802.11 standards propelled wireless LANs from custom to commonplace. Today, most top- and mid-tier distribution operations use RF terminals with barcode scanners over 2.4 GHz wireless LANs.

    Voice logistics solutions can share this common backbone with traditional RF scanning applications. Even operations that currently do not use RF scanning terminals are not put off by the prospect of installing an 802.11x (a, b, g) wireless network, since the technology has become commonplace in this increasingly mobile society.

  • More Powerful Mobile Devices and Standardization - The processing power of mobile computers and terminals has grown dramatically since the late 1990s. This, coupled with the continued evolution of Microsoft’s family of mobile operating systems, has allowed Voice vendors to offer powerful and robust solution sets. The Windows-Embedded product line provides hardware and software suppliers with a standard operating system to build upon.

  • More Vendors and Solutions - The number of vendors offering Voice logistics solutions has steadily grown over the past few years. Voice logistics providers today mostly team up with WMS vendors, offering a standard jointly developed direct interface between the Voice vendor’s client software and the vendor’s WMS. The VoiceDirect ERP for SAP WM and EWM from Vocollect is an example of such a product offering. Under this paradigm, a WMS package can support Voice functionality out-of-the-box, just like RF, with their Voice client communicating directly with the WMS application. For the SAP user, the ability to directly interface with the SAP Netweaver application integration layer adds another level of support for seamless system to system integration.

Going Beyond Promises to Proven Potential

More choice, greater flexibility and a standard infrastructure – it is no wonder that interest in this technology continues to grow. Voice now offers widely documented evidence of improved performance and rapid return on investment. Trade journals, vendor product literature, and web sites are full of customer case studies and client testimonials that make a compelling business case for the technology. Anyone who has worked with the technology knows that Voice has the potential to deliver on its promises.

But this does not mean that Voice is the right solution for every operation. While the benefits reported by one organization may be enticing, they may be much more limited or unobtainable for another operation. Also, implementation costs can vary significantly depending upon the legacy fulfillment or warehousing system, the existing network infrastructure, and operational requirements.

So how does Voice actually stack up in the warehouse? Are vendor claims about benefits and quick ROI really valid? Obviously, the answers to these questions will vary based on the nature of the operation considering the technology. As with any technology investment, Voice implementation needs to be built on a solid business case in order to truly succeed. The depth and components of this justification can vary across operations. But it should always start with a thorough understanding of the operational and business requirements.

It also needs to be based on a basic understanding of Voice’s:

  • Typical usages and alternatives;

  • Prospective benefits measured against the alternatives;

  • Key technology components and application integration; and

  • Cost factors and implementation approaches.

Building a solid business case may require a fair amount of effort beyond this basic evaluation framework. Detailing benefits and costs to the extent necessary to adequately determine ROI takes work. But even if initial findings indicate that Voice is currently not practical for a specific operation, it will not be a wasted effort. While Voice may not be justifiable today, it may well be viable tomorrow for certain organizations.

Key Takeaways:

  • The key drivers of Voice’s increasing role in the warehouse are the proliferation of wireless LANs, the standardization of powerful mobile devices, the availability of more vendors and solutions, and the use of multi-modal devices.

  • Voice can allow for more choice, flexibility, better price points, standardization of infrastructure, improved performance, and rapid return on investment.

  • How Voice stacks up in the warehouse and the validity of vendor claims depends on the typical usages and alternatives, benefits measured against alternatives, key technology components and application integration, as well as cost factors and available implementation approaches.

Typical Uses and Alternatives

Voice is typically employed to support tasks such as order selection, put-away, replenishment and cycle counting within the warehouse. Industries with a high degree of human touch, such as Grocery and Food and Beverage, were early to embrace Voice technology. But, Voice has made significant inroads in other industry segments, including Automotive and Service Parts, Personal Care, Office Supplies, Food Manufacturing, Industrial/MRO Hard Goods Wholesale Distribution, Healthcare and Pharmaceuticals, and Specialty Retail. Picking, or selection, remains a core focal point of interest for most Voice applications, given its proportion of overall labor activity in the DC and its direct impact on customer service levels. Typically, selection is the key component of establishing a business case for Voice in the warehouse.

The most commonly cited Voice alternatives are paper/label processing, RF terminals with barcode scanning, and pick-to-light.

  • Paper/label processing is typically coupled with after-the-fact data entry using desktop terminals. Associates perform warehouse tasks off of pick lists, put-away labels, printed VAS instructions, and other paper documents. Upstream processes (such as how the information is sorted on the documents), and downstream processes (such as scan and verify on a desktop terminal), directly impact paper/label processing’s performance and functionality.

    Paper/label processing is a good fit for many warehouses, especially smaller operations with relatively straightforward transaction requirements. Even operations that rely on RF scanning for the bulk of transactions usually employ paper/label processing for some functions. It can be purely a manual proposition or part of an automatic flow, such as a label case pick-to-belt, where the pick is confirmed by an in-line conveyor scan.

  • RF scanning terminals have been considered a prerequisite for larger, more complex operations. But RF scanning can be found in all different types and sizes of operations primarily due to direct support by most warehouse management systems. Even operations running non-RF enabled legacy fulfillment systems can turn to automated data collection software for this functionality.

    RF scanning offers some distinct advantages over paper/label processing. It can provide positive verification that the warehouse associate is at the right location or picked the correct SKU through a barcode scan or key entry. Work can be pushed out to associates based on location and task priority instead of handed out from a manually managed queue. Transaction data is captured in real time as associates perform tasks. Furthermore, RF scanning makes some functions like multi-order cart selection possible or more practical than paper/label processing.

  • Pick-to-Light (PTL) remains a popular selection technology due to its ability to support high pick rates and its ease-of-use. It is typically used in a zone-based, pick and pass flow where an associate scans a tote or carton barcode label. (The PTL software activates light displays for every location that shows the required quantity needed for the tote or carton.) The associate walks the zone, selecting SKUs and confirming picks by pressing display buttons. Pick quantities can be shorted or increased by button presses. Displays can also be provided to show SKU, order, or other relevant information. Some vendors even have LCD displays that show SKU pictures.

    PTL technology has a number of different variations. Instead of a full quantity display with confirmation buttons per location, a simple light indicator can be provided for each pick face with quantity shown and confirmed on a bay display. This configuration is generally employed in pick modules with slower moving SKUs. The technology can be used to support put-to-light packing, in which an associate scans a case barcode and the PTL software identifies all the staged cartons that require the case’s SKU and associated quantity.

    Some vendors offer PTL “smart” carts, meaning that totes or cartons are associated with light-enabled cart slots. Associates push these carts through the pick module based on the location shown on the cart’s light display. Once the location is confirmed through a wireless barcode scan, the PTL software illuminates the quantity needed for each slot requiring the SKU.

    Also, as its name implies, PTL technology is about the order selection process. Unlike the other technologies discussed in this paper, it is not employed to drive other warehousing functions such as receiving, put-away, and cycle counting. This means any investment in the technology cannot be leveraged beyond the confines of the PTL module and order selection process.

There are many other data collection and material handling technologies that are used to drive warehouse processes. But paper/label, RF, Voice, and PTL remain the most popular selection technologies. This observation is borne out by a recent Supply Chain Consortium survey (Figure 1). So it is no surprise that Voice vendors typically highlight their wares against the other three pick methods.

Selection Technologies

Figure 1. Selection Technologies Used by Survey Respondents, as Reported by the Supply Chain Consortium

Key Takeaways:

  • Voice is initially used in selection operations, which generally makes selection the key component in developing a business case for Voice. However, some operations also use Voice for receiving, put-away, replenishment, and cycle counting.

  • The most popular alternatives to Voice are paper/label processing, RF terminals with barcode scanning, and pick-to-light.

Weighing the Potential Benefits of Voice

Vendors point to a variety of potential benefits for employing Voice within the distribution center. They typically provide metrics for these improvements based on actual case study data from their client base. Moreover, the quantitative benefits of Voice and associated metrics have been well documented in numerous trade journal articles and white papers.

These reported improvements or reductions are usually impressive, but must be viewed within the context of before and after points. They must also be examined against the nature of the operation, product being handled, and systems involved. However, testimonials do provide a general indication of what Voice can do in the warehouse.

Typical Vendor Data

While classifications and measurements vary across case studies and web sites, the claimed benefits generally fall along the following lines:

  • Increased productivity and pick rates;

  • Reduced errors and increased accuracy;

  • Improved throughput and fill rate;

  • Reduced supply costs;

  • Improved control and visibility;

  • Decreased training time;

  • Improved safety;

  • Reduced damage and breakage; and

  • Enhanced worker satisfaction.

Voice vendor web sites provide case study data quantifying many of these benefits, especially productivity and accuracy gains. Reported productivity increases usually range from 8-40%. Occasionally, higher increases may occur. Using RF scanning will see higher productivity gains than with paper.

Most vendors report one or more customers who have at least doubled pick rates. Generally, the featured operation for which the performance measurement is provided is moving from paper or label-based selection to Voice, with a growing number of case studies based on implementations replacing RF scanning or pick-to-light.

Accuracy rates typically cited in these case studies are at least 99.5%, with most reporting higher rates. Corresponding reported reductions in pick error rates range from 80-100%. Some studies detail significant cost savings in supplies (moving from label to Voice selection) and increased fill rates due to reductions in mis-picks.

Figures for other benefits, such as improved safety and reduction in breakage, are also obtained, but rarely used, in the justification and analysis effort. Given Voice’s hands-free and heads-up processing flow, these benefits make intuitive sense. Vendors generally showcase customers who obtained an investment payback within 9 to 12 months.

Voice appears to be an attractive investment proposition in the warehouse. But are the numbers realistic for a specific operation? While there is no reason to doubt the numbers, they must be viewed in the context of the starting point and processes involved.

Measuring Gains in Productivity and Accuracy

Potential productivity gains can be quite significant for an operation moving from paper to Voice. In general, these gains are due to a number of factors beyond the hands-free flow of Voice, including:

  • Changes in pick process, such as moving from discrete order selection using paper pick lists to multi-order cart selection using functionality provided by the Voice application software;

  • Reduction in personnel needed for post-pick checking, packing, and auditing, due to positive pick verification of Voice over paper picks; and

  • Real-time information on inventory levels, order status and picker transaction rates provided by the Voice application software.

Depending on the functionality provided by the underlying software, the above factors generally do not play a significant role when comparing RF Scanning to Voice. From a productivity perspective, the comparison between the two pick methods centers more on the hands-free nature of Voice.

The Tompkins Associates White Paper, Order Selection for the 21st Century, Voice vs. Scanning Technology, documents an implementation of the Vocollect Voice solution at Associated Wholesale Grocers (AWG). The implementation covered five pick areas: dairy, dry, freezer, meat, and perishables.

Figure 2 shows the productivity gains after Vocollect Voice was installed. The two areas, dry and freezer, which were previously supported by a paper-based process, experienced modest gains. The areas previously supported by RF scanning saw much higher increases. This is not surprising, because RF scanning can be more disruptive to selection flow, since it typically requires the user to scan or enter information at multiple points.

Equally understandable are the higher gains in the refrigerated meat and dairy areas, where RF scanning terminals can be more difficult to handle.

Area       Old Pick Method    Selection Productivity Gain
Dry        Paper              3%
Freezer    Paper              4%
Produce    RF Scanning        8%
Meat       RF Scanning        12%
Dairy      RF Scanning        15%

Figure 2. Productivity Increases after Voice at AWG

While the relatively modest gain in selection productivity for Voice over paper may be expected, other factors should be considered when comparing paper to other selection technologies. Paper generally requires post-pick data entry, either at a packing station or through clerical key entry.

Overall productivity gains of moving off of paper need to account for reduction in these efforts. Since picks are not systematically verified as each line is processed, errors are more likely to occur; correcting these errors requires additional labor. Paper also requires manual management that covers preparation, assignment, and post-pick processing. This all entails additional direct and indirect labor that should be considered when quantifying prospective labor productivity gains.

Voice versus RF scanning comparisons should also account for the different types of the RF scanning devices, generally categorized as handheld, wearable, or truck mounted.

Handheld terminals typically require users to holster or set the device down during certain steps in a process. This can add to the overall time to complete a transaction. Wearable units are worn on arms or attached to belts. These lightweight devices capture barcode data through “ring” scanners worn on the index finger. Truck mount units are mounted on material handling equipment such as reach- and order-picker trucks and motorized pallet jacks. Truck mounted devices generally capture barcode data through tethered scanners.

While wearable and truck mounted units do not require users to pick up or lay down the device, warehouse associates still must read the display and key data at certain steps in a process, potentially slowing down the overall transaction time.

Pick-to-light vendor web sites also claim similar benefits for their light-based solutions. As with Voice, productivity and accuracy are typically the cornerstones of any pick-to-light business case. Some vendor web sites cite four- or five-fold productivity improvements over paper-based selection, with individual pick rates approaching 450 lines per hour. While these numbers may seem high, pick-to-light is generally acknowledged as providing the highest pick rate potential of the four selection technologies when pick densities are relatively high.

Some sources assert that Voice provides a greater accuracy potential than RF scanning and pick-to-light. All three technologies can provide significantly lower pick errors than paper, since they all require positive real-time confirmation of the pick. However, some studies report lower error rates with Voice than the other two methods, due to the freeing of hands and eyes from data entry steps.

RF scanning does require the picker to break the flow of the process to perform scans, read displays, and key quantities. Arguably, these breaks in flow can introduce errors into the process. Pick-to-light, by contrast, only requires the push of a button to verify the pick. But the sheer speed of pick-to-light may generate slightly higher error rates than Voice in certain situations, as pickers may concentrate too much on speed at the expense of paying attention to the pick task at hand.

Quantifying Benefits

Case studies can provide a good general indication of the potential of Voice. But they tell stories for specific operations, making them less applicable to any individual distribution center. The potential fit of Voice or any other selection technology is dependent on a variety of underlying factors, including:

  • Order profile – lines per order and units per line;

  • SKU weight and size;

  • Pick container weight and size;

  • Travel distance between picks;

  • Pick line layout and product accessibility;

  • Special data capture requirements such as lot, batch, serial number, or catch weight;

  • Workforce composition, including percentage of temporary workers;

  • Growth potential and need for flexibility; and

  • Functionality of the supporting software application.

Since these factors can vary across operations, building a business case for Voice on the benefits obtained at other sites can be risky. Benefits can certainly be quantified by conducting pilot tests. On the other hand, pilot programs are generally costly and impractical. This leaves two viable alternatives when quantifying benefits: 1) Using case study data and assumptions or 2) Developing engineer-based analysis of anticipated gains.

  • Case Study Data - Certainly the risk involved in using case study data and generalized assumptions to quantify benefits is a function of how much the target operation differs from the case study operations or falls outside of the “norm” that is the basis for the general assumption. For many operations contemplating Voice, this should be a perfectly acceptable risk, especially if conservative numbers are used. Using 10-12% as the anticipated labor productivity increase in moving from paper or RF scanning to Voice is generally a good rule of thumb. But it does not account for variances in operational flow, layout, product, personnel, and legacy systems.

  • Engineer-based Analysis - Quantifying potential benefits through an engineer-based analysis can account for these variances. This approach breaks down the elemental processes and steps for current and prospective processes. It can probably be best appreciated in the context of developing expected pick rates from predetermined elemental tasks and associated times. Employing this approach for quantifying potential pick rates allows for comparisons between technologies and process flows and, if properly done, accounts for variations in the above factors. It is a method requiring specific skill sets in order to produce reliable results and is generally performed by an industrial engineer.

Figure 3 shows an example of the results of a predetermined time element analysis performed for a Tompkins client. It summarizes anticipated case pick rates in cases per hour between paper, RF scanning, and Voice in a refrigerated pick module. Detailed analysis for Voice selection appears in Figure 4. The analysis was developed using time sampling of existing paper-based pick processes, as well as elemental step evaluation for the potential use of RF scanning and Voice. The results show pick rate increases of 6% and 12%, respectively, for moving from paper and RF scanning to Voice.

Pick Technology    Cases/Hour
Paper              196
RF Scanning        184
Voice              209

Figure 3. Anticipated Case Pick Rates in Cooler Module
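
The percentages quoted for Figure 3 can be checked directly from the table. The short sketch below computes the change two ways; the function names are illustrative and not part of the original analysis. Measured against each old method's own rate, the gains work out to roughly 6.6% and 13.6%. The 6% and 12% figures match when the difference is expressed as a share of the Voice rate instead.

```python
# Anticipated case pick rates (cases/hour) taken from Figure 3.
RATES = {"Paper": 196, "RF Scanning": 184, "Voice": 209}


def gain_over_baseline(baseline: float, new: float) -> float:
    """Percent increase of `new` measured against the old (baseline) rate."""
    return (new - baseline) / baseline * 100


def gap_as_share_of_new(baseline: float, new: float) -> float:
    """Difference between the rates expressed as a percent of the new rate."""
    return (new - baseline) / new * 100


if __name__ == "__main__":
    voice = RATES["Voice"]
    for method in ("Paper", "RF Scanning"):
        old = RATES[method]
        print(
            f"{method} -> Voice: "
            f"{gain_over_baseline(old, voice):.1f}% over baseline, "
            f"{gap_as_share_of_new(old, voice):.1f}% of Voice rate"
        )
```

Which convention is used matters when the figure feeds a labor-savings estimate, so it is worth stating the baseline explicitly in any business case.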

Selection - Cases                          Model  Index  Freq     Factor  Total TMU's  Total Hours  Cases / Hr
Walk to Vehicle                            A      1      1        100     100          0.001
Start and Park                             S      3      1        100     300          0.003
Transport                                  T      3      1        100     300          0.003
Load Empty Pallets                         L      3      20.96    10      628.8        0.006
Transport-Directed to Location by Voice    T      3      20.96    100     6288         0.063
Load Case onto Pallet
  Action Distance                          A      3      1048     10      31440        0.314
  Body Motion                              B      3      1048     10      31440        0.314
  Gain Control                             G      3      1048     10      31440        0.314
  Action Distance                          A      3      1048     10      31440        0.314
  Body Motion                              B      3      1048     10      31440        0.314
  Placement                                P      6      1048     10      62880        0.629
  Action Distance                          A      3      1048     10      31440        0.314
Place Label on Case
  Action Distance                          A      1      1048     10      10480        0.105
  Body Motion                              B      3      1048     10      31440        0.314
  Gain Control                             G      3      1048     10      31440        0.314
  Action Distance                          A      3      1048     10      31440        0.314
  Body Motion                              B      3      1048     10      31440        0.314
  Placement                                P      3      1048     10      31440        0.314
  Action Distance                          A      3      1048     10      31440        0.314
Load - Pick up Pallet                      L      10     20.96    100     20960        0.210
Transport - Travel to Next Location        T      1      148      100     14800        0.148
Transport back to Dock w/ Pallets          T      3      20.96    100     6288         0.063
Stop Vehicle                               S      6      1        100     600          0.006
Total:                                                                    500,905      5.009        209.2

Figure 4. Sample Voice Case Selection Elemental Analysis
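
The arithmetic behind Figure 4 can be reproduced in a few lines. The sketch below is one reading of the figure, not the analysts' actual model. It assumes each row's TMU total is Index × Freq × Factor (which matches the figure's numbers), uses the standard conversion of 1 TMU = 0.00001 hour, and treats the 1,048 in the Freq column as the number of cases selected and 20.96 as the number of pallets. The class and variable names are illustrative.

```python
from dataclasses import dataclass

TMU_PER_HOUR = 100_000  # 1 TMU = 0.00001 hour (0.036 seconds)
CASES = 1048            # assumption: Freq of the per-case elements
PALLETS = 20.96         # assumption: Freq of the per-pallet elements


@dataclass
class Element:
    name: str
    index: float   # time index value for the motion
    freq: float    # how many times the element occurs
    factor: float  # multiplier converting the index to TMUs

    @property
    def tmu(self) -> float:
        # Inferred from Figure 4: Total TMUs = Index x Freq x Factor
        return self.index * self.freq * self.factor


def per_case(prefix: str, indexes: list) -> list:
    """Build the A-B-G-A-B-P-A sequence performed once per case."""
    models = ["A", "B", "G", "A", "B", "P", "A"]
    return [Element(f"{prefix}: {m}", i, CASES, 10) for m, i in zip(models, indexes)]


elements = [
    Element("Walk to Vehicle", 1, 1, 100),
    Element("Start and Park", 3, 1, 100),
    Element("Transport", 3, 1, 100),
    Element("Load Empty Pallets", 3, PALLETS, 10),
    Element("Transport-Directed to Location by Voice", 3, PALLETS, 100),
    *per_case("Load Case onto Pallet", [3, 3, 3, 3, 3, 6, 3]),
    *per_case("Place Label on Case", [1, 3, 3, 3, 3, 3, 3]),
    Element("Load - Pick up Pallet", 10, PALLETS, 100),
    Element("Transport - Travel to Next Location", 1, 148, 100),
    Element("Transport back to Dock w/ Pallets", 3, PALLETS, 100),
    Element("Stop Vehicle", 6, 1, 100),
]

total_tmu = sum(e.tmu for e in elements)
total_hours = total_tmu / TMU_PER_HOUR
cases_per_hour = CASES / total_hours

if __name__ == "__main__":
    print(f"Total TMUs:  {total_tmu:,.0f}")
    print(f"Total hours: {total_hours:.3f}")
    print(f"Cases/hour:  {cases_per_hour:.1f}")
```

Structuring the analysis this way makes it straightforward to swap in the element list for a paper or RF scanning process and compare the resulting rates on the same basis.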

Figure 5 shows an example in which pick-to-light and Voice were analyzed for different pick modules and order types. The software solution being evaluated for this operation provided pick-to-light, RF scanning, and Voice selection functionality. The summary results in Figure 5 show that pick-to-light provides a significantly higher pick rate for store orders, especially in the carton flow module. But Voice and pick-to-light have comparable pick rates for service orders in shelving. These rates were incorporated into a cost-benefit analysis that recommended deployment of both technologies in separate pick modules.

                                  Store Orders    Service Orders
Carton Flow
  Discrete Pick-to-Light          261             132
  Discrete PTL Both Directions    278             162
  Batch Voice Selection           186             147
Shelving
  Discrete Pick-to-Light          178             120
  Batch Voice Selection           141             111

Figure 5. Anticipated Pick Rates in Lines/Hour

Pick-to-light presents some fit challenges that go beyond pick rates and raw productivity numbers. It is an inherently more costly and complex technology that typically requires a significantly higher start-up investment and a relatively rigid product flow. Totes and cartons are generally routed between fixed pick zones via a conveyor system. Managing workflow can be an ongoing issue, because of daily workload fluctuations between zones that result in bottlenecks in some and under-utilization in others.

Voice offers much more flexibility to redeploy resources to match daily changes in overall workload on the warehouse floor. Furthermore, changing the configuration of a pick-to-light module can require additional changes to the light displays, communications backbone, and pick-to-light software as well as physical storage media and WMS changes. Reconfiguring pick modules supported by Voice is a much simpler proposition that generally only requires labeling in addition to storage media and WMS changes.

An engineering-based approach can be used to quantify other benefits. However, assumptions may have to be made in certain situations. In this case, sensitivity analysis can gauge the impact of varying these assumptions on the results. Knowing how the software application truly functions is critical to quantifying realistic benefits. It may also factor into the cost portion of the analysis in the event that software modifications are required. Appreciation of how Voice application software performs for any specific operation starts with a general understanding of its key technology components and integration with warehouse systems.
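
A sensitivity analysis of this kind can be as simple as recomputing payback across a range of assumed productivity gains. The sketch below is illustrative only: the investment and annual labor cost are hypothetical placeholders, not figures from any case study, and labor savings are the only benefit counted. It assumes that a productivity gain of g lets the same volume be handled with 1/(1+g) of the labor.

```python
def annual_labor_savings(annual_labor_cost: float, productivity_gain: float) -> float:
    """Labor saved per year if the same volume is handled at a higher rate.

    A gain of g means the work needs 1 / (1 + g) of the original labor,
    so the saved share is g / (1 + g).
    """
    return annual_labor_cost * productivity_gain / (1 + productivity_gain)


def payback_months(investment: float, annual_labor_cost: float, productivity_gain: float) -> float:
    """Months needed for labor savings alone to recover the investment."""
    return 12 * investment / annual_labor_savings(annual_labor_cost, productivity_gain)


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    investment = 250_000          # total implementation cost
    annual_labor_cost = 2_000_000  # yearly selection labor cost

    for gain in (0.06, 0.08, 0.10, 0.12):
        months = payback_months(investment, annual_labor_cost, gain)
        print(f"gain {gain:.0%}: payback in {months:.1f} months")
```

Running the same loop over other uncertain inputs, such as implementation cost or accuracy-related savings, shows which assumptions the business case is most sensitive to.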

Key Takeaways:

  • Vendors use case study data to report improvements, but these reports must be considered in the context of before and after points, as well as the nature of the operation, product being handled, and the systems involved.

  • Productivity and accuracy are the cornerstones of a business case for Voice.

  • Using a pilot program to quantify the benefits of Voice may be too costly; instead, consider using case study data or engineer-based analysis as viable alternatives.

Technology Components & Application Integration

For the most part, Voice logistics solutions share a common hardware and software architecture. SAP users are the major exception, because their IT integration infrastructure is built on SAP’s Netweaver middleware layer. Netweaver is the main means by which systems integrators design and build additional integration between inventory, order management, and warehouse functionality.

For many operations, Voice can be approached as a shrink-wrapped application in which vendor-quoted costs and performance will typically match the results experienced. Once again, the SAP user is an anomaly. Because every SAP user has a unique configuration, a one-size-fits-all interface may not make the most of the existing SAP IT investment.

Making assumptions about how Voice works (or any other data collection technology) in any particular situation – based either on generalizations or how the solution performs at other operations – creates the risk of unexpected and potentially unpleasant results during implementation. Organizations can minimize this risk by taking the time and effort to understand Voice’s basic components and how it interacts with other warehouse systems.

In many ways, Voice employs a technology infrastructure similar to that of RF scanning. It is a distributed technology that uses a standard 802.11x wireless LAN to support communications between client mobile computers and backend servers. The mobile devices typically employ the Windows CE operating system. Client software running on these devices manages data presentation and input services, which in the case of Voice means speech recognition and text-to-speech functionality. Servers provide business application and database functionality, as well as Voice client administration.

Client Components

Within this distributed framework, Voice solutions can vary significantly in how these components function, from both the client and server perspectives. Client speech recognition components are either speaker-dependent or speaker-independent. The speaker-dependent approach requires users to train the system to recognize the specific nuances of their voices. This process involves the system prompting the user to repeat digits and terms. Individual voice templates are stored on a management server and downloaded to the mobile devices as needed. Generally, it takes 15-20 minutes for a user to create his/her voice template. During this time, the user is also trained on how to actually use the new voice device. It is recommended that this training time be used to full advantage to familiarize each associate with his or her new mobile device and headset.

The speaker-independent approach does not require the user to train the system on the specific nuances of his or her voice, but still requires approximately the same user training time. This is the method employed by voice-enabled telephone customer service applications, in which callers respond verbally to system prompts. Most Voice vendors support only one approach.

Those offering a speaker-dependent solution generally claim that the speaker-independent approach is not as dependable in recognizing responses in the relatively noisy environment of most warehouses. Also, the speaker-independent method can be challenged by regional dialect variances. Other factors such as headset quality can impact Voice performance and reliability in the warehouse.

Voice recognition works best in the warehouse when user responses are limited to short distinct phrases and digits. Location verification is generally done by repeating three-digit numeric check digits associated with each location. Lengthy responses can present challenges, both from recognition and performance perspectives. Voice may be an excellent tool for capturing check weights, but barcode scanning may be a better choice for recording serial numbers.
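The check-digit verification described above can be illustrated with a minimal sketch. It assumes the recognizer returns a plain text phrase; the Location class, the digit vocabulary, and the function names are hypothetical and are not taken from any vendor's product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    slot: str          # hypothetical slot label, e.g. "A-12-03"
    check_digits: str  # three-digit code posted at the location


# Assumed vocabulary: the short, distinct digit words a recognizer might return.
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}


def normalize_digits(spoken: str) -> str:
    """Convert a recognized phrase such as 'four oh seven' to '407'.

    Returns an empty string if any word is outside the digit vocabulary.
    """
    digits = []
    for token in spoken.lower().split():
        if token.isdigit():
            digits.append(token)
        elif token in DIGIT_WORDS:
            digits.append(DIGIT_WORDS[token])
        else:
            return ""
    return "".join(digits)


def verify_location(location: Location, spoken: str) -> bool:
    """Accept the location only if the spoken digits match its check digits."""
    return normalize_digits(spoken) == location.check_digits
```

Keeping the vocabulary this small is the point of the design: the fewer distinct words the recognizer must separate, the less room there is for misrecognition on a noisy warehouse floor.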

Voice solutions can also vary in how their client software interacts with backend application servers. Most employ operation-specific programs or task files that are downloaded to the voice-enabled devices. These code sets control communication between the client and application servers and also support client-side functional processing.

Some Voice solutions use a model that dispenses with client-side voice-specific application code. These solutions treat voice as another input/output stream, no different than text displayed or entered on a handheld computer. Input and output mapping is handled by server-based processes. Client-based software handles the local presentation and data capture functions. Performance factors and application integration typically govern which approach is employed.

SAP Server Components

Voice clients communicate with backend servers for application processing and database services. While some functionality may reside on client devices, most data validation and processing logic occur on application servers. For SAP users, Netweaver is the integration layer that supports virtually all integration between data components, such as WMS, inventory management or order management.

Direct Interface, Standalone Approach or Netweaver Direct Interface

There are three basic approaches for integrating Voice: direct interface, standalone application or Direct Netweaver interface. Under direct interface, client software exchanges information in real time directly with the WMS or ERP. This is done through a predefined set of application programming interfaces or service messages that each side can use to send or receive data from the other side. A number of Voice vendors facilitate this approach by publishing a standard library of message transactions. While most top-tier WMS solutions support a direct Voice interface, many WMS packages do not. Moreover, WMS vendors that support a direct interface typically only do so for a single Voice vendor. For SAP, the preferred approach is a direct interface to Netweaver.
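A direct interface built on a predefined message set can be sketched as follows. This is an illustration only: the message types (GET_NEXT_PICK, CONFIRM_PICK), the JSON encoding, and the in-memory task list are assumptions made for the example and do not represent any vendor's published message library.

```python
import json

# In-memory stand-ins for WMS data; a real WMS would own these records.
OPEN_PICKS = [{"task_id": 1, "slot": "A-12-03", "sku": "SKU-100", "qty": 4}]
CONFIRMED = []


def handle_get_next_pick(message: dict) -> dict:
    """Hand the next open pick task to the requesting Voice client."""
    if not OPEN_PICKS:
        return {"type": "NO_WORK"}
    task = OPEN_PICKS.pop(0)
    return {"type": "PICK_ASSIGNMENT", **task}


def handle_confirm_pick(message: dict) -> dict:
    """Record the quantity the operator reported and acknowledge it."""
    CONFIRMED.append((message["task_id"], message["qty_picked"]))
    return {"type": "ACK", "task_id": message["task_id"]}


# The agreed message set: each side knows these types in advance.
HANDLERS = {
    "GET_NEXT_PICK": handle_get_next_pick,
    "CONFIRM_PICK": handle_confirm_pick,
}


def wms_endpoint(raw: str) -> str:
    """Decode one request, dispatch it by type, and encode the reply."""
    message = json.loads(raw)
    handler = HANDLERS.get(message.get("type"))
    if handler is None:
        reply = {"type": "ERROR", "reason": "unknown message type"}
    else:
        reply = handler(message)
    return json.dumps(reply)
```

In this sketch each exchange is a single request and reply processed in real time, which is what distinguishes the direct interface from the batch-style standalone approach.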

Most Voice vendors provide standalone application software capable of supporting core warehousing operations much like a lower-end WMS package. Order and inventory data is downloaded from the higher level host system. All transaction processing occurs on the standalone application, with resulting pick confirmation and inventory data uploaded to the host system. This approach provides the potential for WMS integration through a relatively limited set of interface points. For example, selection demand can be downloaded upon waving and selection responses uploaded after the transaction has been completed.

Building this batch-like interface may be more economical in certain situations than constructing a real-time direct interface. Furthermore, it allows operations using order or inventory management systems to take advantage of warehousing functionality that may be unavailable in their legacy systems. For example, implementing a Voice standalone application may allow an operation to move away from discrete order selection to a more efficient zone, batch, or multiple order selection process.
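The standalone flow described above, with its two interface points and batch selection across orders, might look like the following sketch. The record layouts and function names are hypothetical; an actual standalone Voice application would define its own download and upload formats.

```python
from collections import defaultdict


def download_wave(host_orders: list) -> list:
    """Interface point 1: flatten the order lines released in a wave into pick tasks."""
    return [
        {"order": order["order"], "slot": line["slot"], "sku": line["sku"],
         "qty": line["qty"], "picked": None}
        for order in host_orders
        for line in order["lines"]
    ]


def batch_by_slot(tasks: list) -> dict:
    """Group tasks from multiple orders by slot so one visit serves all of them."""
    batches = defaultdict(list)
    for task in tasks:
        batches[task["slot"]].append(task)
    return dict(sorted(batches.items()))


def upload_confirmations(tasks: list) -> list:
    """Interface point 2: report completed picks back to the host system."""
    return [
        {"order": task["order"], "sku": task["sku"], "qty_picked": task["picked"]}
        for task in tasks
        if task["picked"] is not None
    ]
```

All transaction processing between the download and the upload happens inside the standalone application, so the host system only needs to support these two exchanges.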

A direct interface is more attractive on a cost and performance basis if it is already supported by the WMS vendor. These interfaces provide “out-of-the-box” Voice functionality that requires no additional programming or development – provided the functionality meets the specific requirements of an operation. Generally, this is the case. But in certain situations, both the Voice client and the WMS must be modified to meet requirements. Care should be taken when comparing specific requirements to the base Voice functionality supported by a WMS vendor. It should never be taken for granted that WMS Voice and RF scanning functionality work exactly the same.

The direct Netweaver interface is preferred by SAP users who already have teams trained and educated on Netweaver. The other main advantage of this approach is that SAP-specific skills can be leveraged to rapidly create and support the Voice interface. Figure 6 below, provided by Vocollect, is an example of one vendor’s approach to offering a direct SAP Netweaver interface alternative.

Figure 6. Example of Direct Interface to SAP Netweaver from Vocollect

This approach uses the SAP Internet Transaction Server (SAP ITS) to connect Vocollect VoiceDirect ERP applications to the SAP infrastructure. The SAP ITS is integrated into the kernel of the SAP Netweaver Application Server. SAP ITS presents SAP WM or EWM data in the form of HTML templates and pages to a Java-based Protocol Translator (PT).

The SAP ITS has been designed by SAP to extend business applications to a web browser or the Internet, by converting SAP Dynpro screens into HTML format. SAP ITS provides web access for several SAP products including the SAP ERP and SAP Supplier Relationship Management (SRM). As part of the Vocollect VoiceDirect ERP solution, the mobile RF screens of SAP WM (i.e., LM05 and LM45) have been voice-enabled. HTML templates have been generated for these screens with voice tags to allow the Protocol Translator to read and translate the complete process flow.
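The idea of reading voice tags out of generated HTML templates can be illustrated with a short sketch. It is not the Vocollect Protocol Translator, which the text describes as Java-based, and the actual voice tag format is not given here. The data-voice-prompt attribute and the field names below are purely hypothetical stand-ins.

```python
from html.parser import HTMLParser


class VoiceTagParser(HTMLParser):
    """Collect spoken prompts from input fields carrying a voice tag.

    The 'data-voice-prompt' attribute is an assumed tag format used only
    for illustration.
    """

    def __init__(self):
        super().__init__()
        self.prompts = []  # (field name, prompt text) in page order

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        prompt = attributes.get("data-voice-prompt")
        if tag == "input" and prompt:
            self.prompts.append((attributes.get("name", ""), prompt))


def extract_prompts(html: str) -> list:
    """Return the ordered prompts a translator could speak for one screen."""
    parser = VoiceTagParser()
    parser.feed(html)
    parser.close()
    return parser.prompts
```

Under these assumptions, the order of the tagged fields on the page gives the translator the sequence of prompts for the screen, while untagged fields are skipped.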

Key Takeaways:

  • After selecting a Voice vendor solution, cost and performance can be impacted by the underlying architecture and integration to warehouse applications – a risk that can be mitigated by understanding Voice’s basic components and integration with other systems.

  • Client speech recognition components are either speaker-dependent (requiring users to train the system to recognize their speech) or speaker-independent (no voice template training is necessary, although users still need to be trained on the device).

  • There are three approaches for integrating Voice: direct interface, standalone application and direct SAP Netweaver interface.

Conclusion: Moving Forward with Voice

Voice is not for every distribution center or warehouse. However, the benefits cited in numerous Voice case studies are real and may be obtainable for any individual operation. Voice has moved beyond cutting edge to become an established warehouse technology. Any distribution operation concerned with improving productivity, accuracy, and throughput should give the technology serious consideration.

This should start with the realization that Voice is not a mutually exclusive proposition in the warehouse. Many operations that use Voice employ other technologies such as RF scanning and pick-to-light. What it boils down to is selecting the right tool for the job. Managers of distribution operations need to approach any process or system improvement project from this perspective. Voice is merely one of the technology tools to be considered, and developing a sound business case that looks across available tools is a necessary first step.

Building a business case for Voice or any other technology in the warehouse requires careful delineation and quantification of benefits and costs. It entails an ability to detail current processes and requirements, map how these processes will change, and plan how requirements will be supported using the new technology. Some key factors to keep in mind when evaluating Voice for a particular warehouse operation are:

  • Keep the proper goal in mind – The objective of any evaluation is not to figure out how to get Voice into the warehouse. It is about selecting the best tool for the job.

  • Employ an evaluation approach appropriate to the situation – The details and depth needed for a successful evaluation are dependent on the current operation and systems. For example, an operation already using a WMS solution may want to consider using the package’s direct Voice interface in a particular pick module that is currently supported by RF scanning. Costs, application, and integration components in this situation are much more narrowly defined than for a paper-based operation without a WMS that is being compelled to significantly expand its capacity. The former may be able to get by with a relatively high-level review, but the latter needs a comprehensive analysis.

  • Do your homework – Operations managers do not need to become experts in the technology to consider its use. However, anyone evaluating Voice needs to know enough about its usage, alternatives, benefits, components, cost structure, and integration to make an informed decision. While Voice and WMS vendors can provide guidance and support in developing a business case, any organization contemplating the technology must be prepared to critically challenge its applicability within its distribution center.

  • Put together the right team – Evaluating, implementing, and using Voice are multidisciplinary propositions. The success of any Voice evaluation project is contingent on putting together a cross-functional team that represents management, operations, and IT. Since it may entail a significant investment, finance may also be needed to help frame the business case approach. If adequate internal resources are not available or the evaluation is inherently complex, consider retaining the services of a third-party consultant.

  • Be realistic and above board – The ability to adequately state benefits and costs is the crux of any successful evaluation of a technology or system in the warehouse. However, assumptions and estimates are an inherent component of even the most structured evaluation process. No matter how scrupulous an organization is in its process, there is always the potential of some unknown factor compromising the end results. Some operations respond to this risk by being conservative on benefits and factoring in a contingency line item on costs. Others bracket minimum, expected, and optimistic savings/gains by benefit. Regardless of the approach employed, any operation evaluating the technology needs to occasionally step back and question whether the numbers being employed are realistic.

  • Treat your business case as a living document – Be prepared to live by the business case you develop. Track its performance during implementation and beyond go-live. Measure whether the anticipated ROI was achieved in the projected timeframe. Many organizations do not perform post go-live assessments of their systems projects for a variety of reasons. This is a mistake. Even if a project has missed its mark, knowing the root causes for the situation can present an opportunity to change course.
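The bracketing and contingency practices mentioned in the factors above can be put into simple arithmetic. The sketch below is one possible way to do so; the function names, the single contingency percentage, and the simple payback and first-year ROI formulas are assumptions chosen for illustration, and any figures passed in would be an operation's own estimates.

```python
def payback_months(total_cost: float, annual_savings: float) -> float:
    """Simple payback: months of savings needed to recover the total cost."""
    if annual_savings <= 0:
        return float("inf")  # the investment is never recovered
    return 12 * total_cost / annual_savings


def bracket_case(base_cost: float, contingency_pct: float,
                 savings_by_scenario: dict) -> dict:
    """Evaluate each savings scenario against cost plus a contingency line item."""
    total_cost = base_cost * (100 + contingency_pct) / 100
    return {
        name: {
            "total_cost": total_cost,
            "payback_months": payback_months(total_cost, savings),
            "first_year_roi": (savings - total_cost) / total_cost,
        }
        for name, savings in savings_by_scenario.items()
    }
```

Calling bracket_case with minimum, expected, and optimistic annual savings produces one payback figure per scenario, which makes it easier to see whether the business case still holds at the conservative end of the range. Rerunning the same calculation with actual post go-live figures is one way to keep the business case a living document.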

The expansion of interest in Voice is not a fluke or hype. Voice has a real role to play within the warehouse and has rapidly become a mainstream technology. While it may not be viable in the near or even long term for many operations, many others stand to gain from its employment. The first step in this process is determining how it stacks up within the warehouse. Given the evolutionary aspect of Voice technology and applications, this is not a static proposition. If the technology is not a good fit today, it may be eminently viable tomorrow.

Contact Information

Tom Singer, Principal
Tompkins Associates
tsinger@tompkinsinc.com

About Tompkins Associates

Tompkins Associates transforms supply chains for profitable growth. An industry leader for more than 30 years, Tompkins designs and integrates value-based, end-to-end supply chain solutions that encompass growth and business strategy, global supply chain services, distribution operations, information technology, material handling integration, and benchmarking and best practices. The company is headquartered in Raleigh, NC. For more information, visit www.tompkinsinc.com.

References

Tompkins’ Supply Chain Consortium is the premier source for supply chain benchmarking and best practices knowledge. With more than 300 participating retail, manufacturing and wholesale/distribution companies, the Consortium sponsors a comprehensive repository of 17,000-plus benchmarks complemented by search capabilities, online analysis tools, topic forums and peer networking for supply chain executives and practitioners. The Consortium is led by the needs of its membership and an Advisory Board that includes executives from Campbell Soup Company, Hallmark Cards, Hewlett Packard, Ingram Micro, Kraft Foods, Miller-Coors, The Coca-Cola Company, Target, and True Value Hardware. To learn more about how your company can become a member of the Supply Chain Consortium, contact John Foley, 919-855-5461 or visit www.supplychainconsortium.com.