Testing, testing... making complex microelectronics testable
TWI Bulletin, January - February 1998
Nihal is Principal Consultant in the Advanced Materials and Processing Department of TWI, providing technical skills in electronics technology and reliability.
Applying a design for testability policy early in the development cycle improves the quality, reliability and cost effectiveness of complex electronic components, as Nihal Sinnadurai explains.
As integrated circuits (ICs) grow more complex, particularly those which are application specific (ASICs) and therefore at the leading edge, they pose increasing problems as a consequence of the need to provide compact and high performance interconnection to the outside world. Surface Mounting (SM) micropackaging was developed in the 1970s as an answer to the problem of delivering fully tested components into miniaturised complex circuit assemblies. Indeed SM packaging is expected to predominate in modern electronics for at least a decade straddling the millennium. However, the requirement for complexity moves inexorably on, and new solutions to achieve compact complexity have resulted in the integration of many bare-die ASICs on high density substrates as Multi-Chip Modules (MCMs), which provide complete application specific functions.
Fig.1 MCM testability versus complexity
MCMs use bare-chip complex ICs which, if not fully tested prior to assembly in the module, will lead immediately to a poor yield of the MCM, and subsequently also pose reliability hazards because of their unknown performance and ageing characteristics. It is evident from the major campaigns to establish KGD (known good dice) delivery programmes, that the MCM community is alert to the inadequacies of wafer-probe testing prior to assembly in the modules. Conventional wafer-probe testing achieves rudimentary fault coverage. The consequence of inadequate testing is passed on to the assembled MCM, giving rise to a poor yield of the assembled module or an ill characterised device because the MCM may not be readily in-circuit tested and therefore diagnosed. Therefore MCM testability decreases with increasing complexity (Fig.1). MCM technology is essentially a hybrid solution to semiconductor integration. Yet some test solutions are more akin to archaic board level approaches. Some intermediate solutions are very complicated and add a number of process steps and potential damage to the KGD, for instance requiring temporary bonding into carrier packages for testing and burn-in and then de-mounting and rebonding into the final packaging or interconnection location. Such extra handling and processing add extra hazards and cost and do not fully represent eventual die performance. Such steps also contradict the aims of semiconductor foundries who iteratively test and improve their manufacturing practices to eliminate the need for burn-in. Therefore embedded generic solutions are required which commence at the source of the IC dice.
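The effect of incompletely tested dice on module yield can be illustrated with a rough calculation. The sketch below assumes, as a simplification that is not taken from this article, that die defects are independent, so that first-pass module yield is the product of the probabilities that each die is good. The function name and the 95% figure are hypothetical.

```python
def module_yield(die_good_probabilities):
    """First-pass MCM yield, ASSUMING independent die defects (illustrative only)."""
    result = 1.0
    for p in die_good_probabilities:
        result *= p
    return result


# Hypothetical module: ten bare dice, each 95% likely to be good after wafer probe.
ten_dice = [0.95] * 10
print(f"Module yield: {module_yield(ten_dice):.1%}")
```

Under these assumptions ten such dice give a module yield of roughly 60%, which is why modest shortfalls in die-level fault coverage become expensive at module level.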
The Magnitude of the Problem
The burden of cost borne by design for test and actual testing in production has progressively increased (Fig.2), as a consequence of increasing complexity of monolithic ULSI ICs and the recent complicated approaches to KGD and MCM testing. The expectation is that the present inexorable trend towards increased costs will be eased as more intelligent and embedded approaches to testability are adopted.
Fig.2 Trends in IC complexity and test cost.
| Year | 1970 | 1975 | 1980 | 1985 | 1990 | 1995 | 2000 |
| Complexity | SSI | MSI | LSI | VLSI | ULSI | MCM | MCM |
| Gate Count | 10 | 100 | 5k | 25k | 200k | 800k | 2000k |
| Memories | 256 | 1k | 16k | 256k | 16Mb | 256Mb | 10Gb |
| Transistors | 10^2 | 10^3 | 10^5 | 10^6 | 10^9 | 10^12 | 10^14 |
| Speed (Hz) | 100k | 1M | 10M | 30M | 100M | 300M | 500M |
| Pins | 14 | 28 | 44 | 128 | 356 | 600 | 1000 |
| Test/Total Cost% | 5% | 10% | 20% | 40% | 60% | 70% | 60% |
It is notable that the rate of rise of test cost is already slowing, because the problem of design for testability is being addressed and embedded solutions are being delivered by many, but not all, manufacturers. However, the ideal of fully embedded solutions is not yet available, and therefore interim solutions are necessary. As the circuits become complex and are no longer assemblies of simple discrete ICs, they may no longer be probed to achieve in-circuit testing and diagnostics. Thus the testability of monolithic ICs and that of SM assemblies of micropackaged ULSI and other microcomponents have posed significant difficulties. Even today, ahead of the expected rapid growth of MCMs early in the next millennium (growth rate 19% p.a. and market exceeding $6 billion), the problems of ULSI testability to achieve high fault coverage are quite severe and are solved by partitioning, adding extra test pins, long test sequences or reconfiguring the circuits to create sequentially testable paths.
Some earlier solutions to facilitate testing
Effective testing and testability require the ability to configure the circuit to gain external control of ('controllability') and observe ('observability') all internal nodes. CAMELOT® (Computer Aided MEasure of LOgic Testability) - a development initiated by British Telecommunications in the nineteen eighties - is an effective testability measurement system which has since been incorporated into the HITEST® automated test pattern generation (ATPG) system. While there have been some modest interim solutions for VLSI testability, the assembly of high pin-count micropackages with closely spaced terminals, and the lack of through-hole connections, have presented serious obstacles to access for in-circuit testing to control and observe critical internal nodes of the circuits.
ICs do not lend themselves to mechanical in-circuit testing. Nevertheless, weaknesses in design or realisation have to be diagnosed and solved. As ICs became more complex, electronic access to internal nodes became difficult, and required corporate test strategies to ensure the design incorporated one or more testability features, such as partitioning, test access connections, breaking of loops or Level Sensitive Scan Design (LSSD). LSSD is a rigorous technique whereby every register in a circuit resides on, and can be configured into, a scan path separating complex combinatorial circuits into smaller blocks for ease of testing. The registers are configured into the scan path for testing together with the use of a scan input and a scan output, enabling a stimulus (controllability) pattern to be clocked onto the scan path and the response (observability) shifted out.
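The shift-in, capture, shift-out sequence of a scan path can be sketched in a few lines. This is a toy model only: the three-bit register, the logic block (a full adder plus an AND gate) and all function names are hypothetical, and the level-sensitive two-phase clocking of true LSSD is not modelled.

```python
def shift(register, scan_in_bit):
    """One scan clock: every bit moves one place towards scan-out."""
    scan_out_bit = register[-1]
    return [scan_in_bit] + register[:-1], scan_out_bit


def combinational_block(a, b, c):
    """Toy logic under test (hypothetical): adder sum, adder carry, a AND b."""
    return [a ^ b ^ c, (a & b) | (a & c) | (b & c), a & b]


def carry_stuck_at_0(a, b, c):
    """The same logic with a stuck-at-0 fault injected on the carry output."""
    good = combinational_block(a, b, c)
    return [good[0], 0, good[2]]


def scan_test(stimulus, logic=combinational_block):
    register = [0] * len(stimulus)
    # Controllability: clock the stimulus in, last bit first,
    # so that register == stimulus when shifting stops.
    for bit in reversed(stimulus):
        register, _ = shift(register, bit)
    # Capture: one functional clock loads the logic response into the registers.
    register = logic(*register)
    # Observability: clock the response out through scan-out.
    response = []
    for _ in range(len(register)):
        register, out_bit = shift(register, 0)
        response.append(out_bit)
    return response[::-1]  # restore register order


print(scan_test([1, 1, 0]))                          # fault-free response
print(scan_test([1, 1, 0], logic=carry_stuck_at_0))  # response with the fault
```

The two responses differ in the carry position, so this single stimulus detects the injected fault without any physical access to the internal node.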
The benefits of LSSD are revealed by the dramatic improvement in automatic test generation (ATG) that can be achieved for, say, a 2000 gate IC within about 14 minutes for 99.9% fault coverage, contrasted with the laborious fault grading techniques which took over 50 hours to achieve about 60% fault coverage. Indeed today there are commercial CAE tools which combine logic simulation and testability measures to achieve ATG. Meanwhile, the problem of in-circuit testing of high density SM circuits using probes has been tackled by the development of special fixtures with 'pogo probes', or alternatively fixtureless robotic controlled moving probes, to gain access to the small contacts available on SMT assemblies.
The conflict for the designers has been the need to achieve maximum packing density and yet provide access points for the fine probes. Today probe resolution is typically as small as six microns and pad separation typically 100 microns. The contact pad options range from the generous use of extended test pads through to probing the leads of the micropackages (Figs 3-5). The presence of components on the same side as the probes adds complication to the design of the test assembly, requiring precise guide holes and the use of a physical stop to halt the travel of the probe bed. Probing the package leads is unsatisfactory because the leads can slope and their surfaces may add contact resistance. The problem is more severe with leadless packages and where designs incorporate buried vias. Therefore, an alternative philosophy for in-circuit testing of high density assemblies was needed and indeed has been delivered.
Fig.3 Probing extended test pads.
Fig.4 Probing standard surface mount pads.
Fig.5 Probing the package leads.
Emerging physical access solutions for KGD
Temporary connections for KGD assessment
Fig.6 Gold ball-on-ball wire bonding.
A useful comparison of die level burn-in (DLBI) carrier technologies was produced by Vasquez and Lindsey. The trade-offs are that permanent micropackages offer fully tested dice but take up more area on the substrate, whereas semi-permanent attachments require either force or energy (heat) to detach the tested chip, thereby delivering a stressed contact surface into the subsequent permanent joints with potential impairment to their reliability. An example of such temporary connection is that reported by Kim et al which uses gold ball-wedge bonding onto a burn-in PCB from which the chips are excised after burn-in by cutting all wires above the balls, which then serve as the bonding pads for subsequent ball bonding, ie ball-on-ball (Fig.6). Thus the underlying ball is subjected twice to bonding stress which accelerates ageing of the joint with the IC bond pad. An alternative promising development by nChip, for their silicon burn-in substrate (SiBIS), incorporates compliant solder bumps for wafer-probe testing. The solder bumps on a silicon probe card are softened and distorted into ohmic contact with the IC bond pads on the wafer. The solder bump shapes are re-established by the reflow to achieve up to 1000 reuses. The sequence of bump-distort-reflow is illustrated in Fig.7. A corresponding force must have occurred on the wafer surface. Also, this force would progressively increase as the solder bumps become more crystalline with each reflow. Despite this critical comment, there is promise from this technology which may yet be developed into a manifold reuse technique and be translated into a production technology.
Fig.7 Solder ball before bonding,
A third option is the use of temporary carriers (Fig.8), which add the hazard of introducing stress at the IC bond pads in order to ensure good ohmic contact for measurement and burn-in. In one form, the contact is initiated by deforming the bond pad to break through the native oxide layer, requiring more force per contact than a normal wirebond. Another form is piercing contacts, which use an irregular, sharp-edged surface texture. A third form establishes ohmic contact by scrubbing, which also causes damage to the bond pad surfaces.
Fig.8 Temporary die carrier.
As increasing proportions (of total IC sales) of bare dice are used in MCMs such solutions will continue to be explored, and add evidence for the need to produce effective non damaging methods of functional and in-circuit testing to deliver genuinely known good dice.
Combined physical and electronics solution to in-circuit testing
A practical solution to embedded testability has been formulated which applies to high complexity monolithic circuits, high density SM assemblies and MCMs, as a result of the initiatives by the Joint Test Action Group (JTAG). JTAG, formed by initiatives in Europe, produced and defined the concept of 'Boundary Scan', which allows 'virtual probing' of internal nodes beyond the I/O buffer. The concept has since been embodied as IEEE Standard 1149.1 and has attracted the interest of the major IC manufacturers in delivering VLSI and ULSI which conform to the testability standard. There is an alternative powerful technique developed by CrossCheck for high density logic ASICs, which embeds a test point array in the design - ensuring that each node is located at an intersection of an x-y line - thereby providing access to all nodes. CrossCheck can deal with synchronous and asynchronous logic and covers bridging and stuck-at faults and transistor defects. Despite the promise, it has not emerged as the preferred option because of the added interconnection complexity that has to be embedded in the IC and thereafter in the system.
Boundary Scan is now an established standard and already is being used to develop a testability hierarchy up to system level. This recognises the ongoing activity on IEEE standards 1149.X for equipment design (eg P1149.5 for communicating maintenance messages between field replaceable units in a system).
This section further examines the adaptation of various solutions that have been developed at the IC level and illustrates solutions that are being used in practice for MCMs.
An integrated approach
Modern design-to-test policies require the inclusion of appropriate test features and testing at the chip level as the key building blocks to board level testability, and also rigorous bare-board (substrate) testing to ensure the subsequent MCM testability is not compromised. Rigorous inspection includes high resolution wafer inspection facilitated today by Atomic Force Microscopy (AFM) inspection such as that by Technical Instrumentation Company, USA and Danish Micro Engineering. AFM provides 100 times better resolution than optical microscopy and immediate capture of information into a personal computer. An alternative fine resolution inspection system is Step and Scan by Nikon, which uses deep UV light (248nm) and fine steps with an alignment accuracy of 50nm, to scan fine lines across a wafer with resolution of better than 0.25 microns.
Integrated test strategies are used by a number of IC and original equipment manufacturers of MCMs, who no longer find the temporary packaging option acceptable for reasons of cost, quality and reliability. Electronics design automation (EDA) tools incorporate many of the tools to enable electronics design for testability. EDA assisted design-to-test begins at simulation to verify functionality and to develop and verify test stimuli, facilitating LSSD, built-in self-test (BIST), test ROMs, design partitioning and the addition of consequent extra pins and packaging.
For prototype testing of wafers without mechanical probing, there are already developments which input design data from the CAE system to control electron beam probing of IC chips to verify the design and diagnose faults. Such DFT methods were instrumental as early as the 80386 microprocessor development in enabling early delivery of a fully functional device and testing of production parts.
Boundary Scan for ICs
Fig.9 Boundary scan as applied to an integrated circuit.
The incorporation of Design for Testability (DFT) of systems starts with the critical components, namely the VLSI and ULSI. Boundary Scan requires the incorporation of a 4-wire serial test bus comprising 'Test Data In' (TDI), 'Test Data Out' (TDO), 'Test Mode Select' (TMS), and 'Test Clock' (TCK), and a boundary scan register (Fig.9). The additional test circuitry consumes about 10% of the chip area. The technique takes full advantage of Scan Path design, described earlier, which today is no longer an interesting option, but an essential solution to the need to deliver tested and testable chips - it is the key to reducing overall product cost through significant increases in quality and reductions in production costs.
The IC test functions are controlled by a Test Access Port (TAP) controller which is a state machine communicating over the test bus. TDI and TMS are kept at logic-high unless otherwise driven. TDO is normally high impedance, and can acquire one of three states according to the data shifted through the IC. TMS initiates the state of the TAP, which selects the test mode. TCK clocks data into the IC through TDI and out through TDO. Either test instructions or test data can be scanned through the IC.
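The way TMS steers the TAP controller can be illustrated with a small table-driven model. The sixteen states and their transitions below are given as they are commonly documented for IEEE Std 1149.1 and should be checked against the standard itself; the Python names are hypothetical, and TDI, TDO and the registers are not modelled.

```python
# state: (next state when TMS=0, next state when TMS=1), sampled on a TCK edge
TAP_TRANSITIONS = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle":    ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan":   ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR":       ("Shift-DR", "Exit1-DR"),
    "Shift-DR":         ("Shift-DR", "Exit1-DR"),
    "Exit1-DR":         ("Pause-DR", "Update-DR"),
    "Pause-DR":         ("Pause-DR", "Exit2-DR"),
    "Exit2-DR":         ("Shift-DR", "Update-DR"),
    "Update-DR":        ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan":   ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR":       ("Shift-IR", "Exit1-IR"),
    "Shift-IR":         ("Shift-IR", "Exit1-IR"),
    "Exit1-IR":         ("Pause-IR", "Update-IR"),
    "Pause-IR":         ("Pause-IR", "Exit2-IR"),
    "Exit2-IR":         ("Shift-IR", "Update-IR"),
    "Update-IR":        ("Run-Test/Idle", "Select-DR-Scan"),
}


def walk(state, tms_sequence):
    """Apply one TMS value per TCK and return the list of states visited."""
    visited = []
    for tms in tms_sequence:
        state = TAP_TRANSITIONS[state][tms]
        visited.append(state)
    return visited


# From reset, TMS = 0,1,0,0 reaches the state in which test data is shifted.
print(walk("Test-Logic-Reset", [0, 1, 0, 0]))

# Holding TMS high for five clocks returns to reset from any state,
# which is consistent with TMS being held at logic-high when not driven.
print(all(walk(s, [1] * 5)[-1] == "Test-Logic-Reset" for s in TAP_TRANSITIONS))
```

A table-driven model keeps the controller easy to audit against the published state diagram, since each row corresponds to one state and its two outgoing arcs.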
The addition of a little more internal 'tester' logic to the IC provides built-in self test (BIST). Such logic typically comprises a linear feedback shift register and a pseudo-random test pattern generator, thus minimising the stimulus and response vectors to be stored in the BIST circuitry. Because BIST uses the same types of transistors as the rest of the circuit, the tests can run at the maximum clock rate. The availability of Boundary Scan and BIST greatly improves the information gained, and simplifies the testing during wafer probing. By probing just a few of the IC pads and the four TAP terminals instead of probing all pads, and initiating the test routine, a significant portion of the logic can be exercised. For example, Vertex Semiconductors (San Jose, California) has achieved 99% fault coverage by accessing just 20 pads per chip, thereby avoiding cost and alignment problems of a 300 pin probe card.
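A linear feedback shift register generates a long pseudo-random pattern sequence from very little logic, which is why few vectors need to be stored. The following is a minimal sketch, assuming a 4-bit Fibonacci-style register with feedback taken from the two most significant bits; the width, taps, seed and function name are illustrative choices and are not taken from any device described here.

```python
def lfsr_patterns(seed=0b0001, width=4, taps=(3, 2)):
    """Yield successive LFSR states, starting at the seed, until the seed recurs.

    The all-zero seed is a lock-up state and yields only itself.
    """
    mask = (1 << width) - 1
    state = seed
    while True:
        yield state
        feedback = 0
        for tap in taps:
            feedback ^= (state >> tap) & 1   # XOR of the tapped bits
        state = ((state << 1) | feedback) & mask
        if state == seed:
            return


patterns = list(lfsr_patterns())
print(len(patterns), [format(p, "04b") for p in patterns])
```

With these taps the register steps through all 15 non-zero 4-bit values before repeating, so a single stored seed stands in for 15 stimulus vectors.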
In 1992 National Semiconductor (NS) introduced its bus-interface family for boundary scan testing. NS, unlike Texas Instruments (TI), located the TAP pins so that the same sockets may be used for ICs with or without boundary scan. NS placed TMS and TDI in the top row of the 56 lead SSOP (shrunk small outline package) and the TDO and TCK in the bottom row, whereas TI placed all four pins in the bottom row. Meanwhile, Philips and Motorola proposed different pinouts for 64 terminal devices. Clearly there was the need for agreement on standard assignment of pinout for ICs conforming to P1149.1, now being commonly referred to as JTAG devices, named after its European originators. Pinout standardisation may be reached by agreement through JEDEC (Joint Electron Devices Engineering Council). The world-wide market for JTAG devices is estimated to have reached US$450 million by 1996 - a good incentive for the IC manufacturers.
Modelling approach to testing analog and mixed signal ICs
The problem of testability is different when testing analog circuits, because access is not the problem, duration is. Testing mixed signal devices, in particular analog-to-digital converters (ADC), can be very time consuming - testing all possible output codes of a 13 bit ADC requires 2^13 (8192) different values of input voltage. An effective alternative is to solve the independent equations from 13 sets of information obtained by measurements made at each binary exponent, thereby obtaining the data to calculate, rather than measure, all 8192 values. Hence, the skills of the test engineer may now be directed at defining the variables and the reduced set of test points and setting up the calculation program, in order to characterise fully the ADC. Commercially developed aids such as QR Factorisation (factoring a right (R) triangular matrix and an orthogonal (Q) matrix) are available to make machine solutions less subject to computer rounding errors. Where mixed signal circuits are more complex, then partitioning into individual testable segments and extra test access terminals are necessary to permit board level diagnostics, as described later.
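The reduction from 8192 measurements to 13 can be sketched as follows. The sketch assumes a purely additive model in which the analogue level for a code is the sum of the weights of the bits that are set; that model, the simulated weight errors and all names are assumptions made for illustration. With one measurement per binary exponent the equations decouple, so no matrix factorisation is needed here; QR factorisation becomes useful when more measurements than unknowns are fitted.

```python
N_BITS = 13

# Hypothetical "real" converter: ideal binary weights (in LSBs) with small errors.
TRUE_WEIGHTS = [(1 << i) * (1 + 0.001 * ((i % 3) - 1)) for i in range(N_BITS)]


def measure(code):
    """Stand-in for a bench measurement of the analogue level for one code."""
    return sum(w for i, w in enumerate(TRUE_WEIGHTS) if (code >> i) & 1)


def estimate_weights():
    """13 measurements, one at each binary exponent (codes 1, 2, 4, ...)."""
    return [measure(1 << i) for i in range(N_BITS)]


def predict(code, weights):
    """Calculate, rather than measure, the level for any code."""
    return sum(w for i, w in enumerate(weights) if (code >> i) & 1)


weights = estimate_weights()
predicted = [predict(code, weights) for code in range(1 << N_BITS)]

# For this simulated device only, compare against exhaustive "measurement".
worst = max(abs(predicted[c] - measure(c)) for c in range(1 << N_BITS))
print(len(weights), "measurements ->", len(predicted), "calculated levels")
```

On a real converter the additive model would first have to be validated, since errors that depend on combinations of bits are not captured by 13 independent weights.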
Extension of testability techniques to circuit boards and MCMs
Board level testability
As Printed Circuit Boards (PCBs) and MCMs are built up from complex ICs, extension of logic testability requires that all the building blocks, such as the ICs and substrates, should incorporate P1149.1 capability or be adapted to take advantage of as many JTAG chips as can be used in the circuit. Ideally, the circuit design can expand the test bus throughout the PCB or the MCM (Fig.10). Testing is then effected through a board-level maintenance controller to facilitate production testing of the assembled PCB or MCM. Boundary scan at this level works by passing test data or instructions from TDO of one IC to TDI of the next, allowing test information to be clocked past the interconnected ICs and enabling each IC to be interrogated in turn. If the ICs were previously tested, then the same static vectors can be reused for testing the IC when it has been assembled on the PCB or MCM.
Fig.10 Boundary scan extended to circuit boards and MCMs.
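The TDO-to-TDI daisy chain can be modelled as shift registers connected in series. The sketch below is illustrative only: the device names, register lengths and helper functions are hypothetical, and the TAP controller, instruction register and TCK/TMS wiring are omitted.

```python
class ScanDevice:
    """One IC on the chain; reg[0] is nearest TDI, reg[-1] nearest TDO."""

    def __init__(self, name, length):
        self.name = name
        self.reg = [0] * length

    def clock(self, tdi):
        tdo = self.reg[-1]                 # bit presented on TDO before the shift
        self.reg = [tdi] + self.reg[:-1]   # everything moves one place along
        return tdo


def shift_chain(devices, bits):
    """Clock bits into board TDI; return the bits seen at board TDO."""
    out = []
    for bit in bits:
        for dev in devices:                # TDO of one IC feeds TDI of the next
            bit = dev.clock(bit)
        out.append(bit)
    return out


def load(devices, wanted):
    """Load each device with its wanted contents in a single pass."""
    flat = [b for dev in devices for b in wanted[dev.name]]
    shift_chain(devices, flat[::-1])       # bits for the far end go in first


chain = [ScanDevice("U1", 3), ScanDevice("U2", 1), ScanDevice("U3", 2)]
load(chain, {"U1": [1, 0, 1], "U2": [0], "U3": [1, 1]})
print({dev.name: dev.reg for dev in chain})

# Shifting the same number of clocks again brings every stored bit out at TDO.
print(shift_chain(chain, [0] * 6))
```

The one-bit register for U2 stands in for a device placed in bypass, which shortens the chain when only its neighbours are being interrogated.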
The versatility of the technique is such that the testability can then be hierarchically escalated to the equipment level, to enable the PCBs to be tested and diagnosed at the system level in the field, by means of a system diagnostic bus. This would avoid the return of some 60% of boards from the field, subsequently found to be fault-free, saving considerably in logistics and cost. Standards are already emerging in this area, for instance the IEEE proposals:
- P896 FutureBus+ for modules, backplanes and chassis,
- P1386 COMBUS for a standard backplane interface bus protocol for high speed communications access to Synchronous Optical Network (SONET) and Synchronous Digital Hierarchy (SDH) based networks,
- P1149.x series for equipment design for testability.
A number of proprietary systems are already in operation by those leading the field, such as IBM's Diagnostic Expert for Testing (DEFT) and BT's Line Card diagnostics system. Both systems make use of field experience and expertise to build the rules for knowledge bases covering the range of faults identified. Thereafter, because the rules are the same for all users, and they are guided through the diagnoses, the users' skills are reinforced with use, and their experience is fed back into the knowledge base.
ATE is already available from many suppliers to test JTAG ICs and PCBs. IC manufacturers and ATE suppliers are also developing CAE design-to-test interface software. An associated important trend in CAE software is that, while there is justified emphasis today on hardware description language (HDL) for high level capture of circuit description, the emphasis is crossing over to get test designers involved in the earliest stages of design and to adopt HDL methodology. Already there are novel simulation tools which provide software 'breadboard' simulation of DSP applications by combining ASICs, standard device models, JTAG DFT functions and board timing delays.
LSSD and Boundary Scan solutions for MCMs
Volume production of MCMs is undertaken both by IC and system manufacturers. Motorola is a company which provides both IC and systems solutions, for which it has developed expertise in volume production of MCMs, including the essential test methodology. The individual chips incorporate special features for their testability in production as monolithic ICs. A solution consists of LSSD for random logic fault coverage greater than 95% of the individual chips, boundary scan corresponding to P1149.1, together with some non-standard P1149.1 instructions via TAP. Thus the need for additional sort die to invoke special test instructions is eliminated. In the example, high yield is achieved by a combination of a Main Control Chip and a smaller support chip which has high fault coverage (>99%) at wafer probe, which may be tested at high speed at 66MHz. The yield is then limited by the lesser fault coverage Control chip. MCM testing is effected by using the LSSD test control pins (multiplexed) to set the continued die function pins (databus and address) into a high impedance tri-state condition and deselect chips not being tested, while using the original pattern structure to test the selected chip with LSSD and functional patterns. The requirements for cost-effective MCM production are that there are no extra overheads in space, pin outs or test procedures.
Complex MCMs incorporate many VLSI chips built on costly multilayer substrates. Therefore, they must be designed with diagnostics testability to enable rework. Not all available VLSI incorporate P1149.1 testability features. In these circumstances, either the substrate can be designed to facilitate testability or additional circuitry is necessary. Today, complex MCMs are built onto active silicon substrates incorporating simple switching functions. Current testability developments, for example by NS, are exploiting this opportunity to build scan rings into such active substrates in order to access chips which do not have P1149.1 capability. A complex DRAM MCM incorporates P1149.1 featured buffer circuits on all Address and Control inputs and bidirectional input/output (I/O) data lines. Thus all MCM I/O terminals have P1149.1 registers connected in a scan path for board assembly and diagnostics. Although the RAM chips do not have P1149.1 features, such functions on the buffer internal ports provide for MCM assembly fault (wirebonds, tracks) diagnostics via an MCM P1149.1 TAP. The outcome comprises nine P1149.1 buffer chips to verify I/O and interconnect, six extra pin-out nodes connected to the DRAM RAS and CAS signals for DRAM diagnostics, plus fifty two test probe points for MCM internal diagnostics. A similar exploitation of partial population by P1149.1 ICs in an MCM comprises four P1149.1 featured ICs and two non-conforming ICs, in which diagnosis was achieved by disabling the non-conforming ICs using terminations on the MCM. Clearly practical solutions are therefore optimised using P1149.1 boundary scan chips where possible, with additional partitioning and test point access to provide diagnostics access into the assembled MCM.
At what stage of the product should testability be addressed?
Design to test
Design engineers may no longer 'Design for Test' (Fig.11) and throw their designs 'Over the Wall' to production test engineers. Instead, it is clear that design and production engineers have learned that they are part of the same team, that the wall has to be removed (Fig.12), and they now have to 'Design to Test'.
Cost benefit of Design to Test
A new analysis for this paper has taken account of product lifecycles in today's marketplace (Fig.13). The analysis shortens the lifecycle to around 10 years from the previous 20 years and finds that there is a consequential marginal increase of the impact of design on subsequent lifecycle costs - that 72% of operation and maintenance costs are determined at the design stage of a product, a further 13% being influenced at the subsystem development stage. In other words, only 15% of operation and maintenance costs can be controlled by those responsible for operation and maintenance! Therefore Design to Test is crucial in controlling logistics and consequent costs of all systems operations.
Fig.13 Contributors to product life cycle costs.
| Glossary |
| ADC | Analogue to Digital Convertor |
| ATE | Automated Test Equipment |
| ATG | Automated Test Generation |
| ATPG | Automated Test Pattern Generation |
| BIST | Built-in Self Test |
| BT | British Telecommunications plc |
| CAE | Computer Aided Engineering |
| DEFT | Diagnostic Expert for Testing |
| DFT | Design for Testability |
| DLBI | Die Level Burn-in |
| DRAM | Dynamic RAM |
| DSP | Digital Signal Processor |
| EDA | Electronics Design Automation |
| HDL | Hardware Description Language |
| ICs | Integrated Circuits |
| JEDEC | Joint Electron Devices Engineering Council |
| JTAG | Joint Test Action Group |
| KGD | Known Good Dice |
| LSSD | Level Sensitive Scan Design |
| MCMs | Multi-chip Modules |
| node | Junction in a circuit |
| NS | National Semiconductor |
| PCB | Printed Circuit Board |
| RAM | Random Access Memory |
| ROM | Read Only Memory |
| SEM | Scanning Electron Microscope |
| SMT | Surface Mounted Technology |
| SSOP | Shrunk Small Outline Package |
| TAP | Test Access Port |
| TCK | Test Clock |
| TDI | Test Data In |
| TDO | Test Data Out |
| TI | Texas Instruments |
| TMS | Test Mode Select |
| ULSI | Ultra Large Scale Integration |
| VLSI | Very Large Scale Integration |