Failure is the First Step on the Road to Success, Part 2

Continued from Failure Is The First Step on the Road To Success, Part 1

Non-destructive testing overlaps to a certain degree with the next step in the process, wherein an analyst attempts to isolate the failure to as small an area as possible. This phase of the project may include both destructive and non-destructive work as necessary to locate a defect site. Some problems may be fairly simple to isolate, given the correct tools: a low-resistance short between nodes of a board may be revealed in a matter of seconds using a thermal imaging camera, and the aforementioned cracked solder joint found during visual inspection can usually be probed for continuity with very little trouble. Other defects may require patience, a steady hand, and a methodical plan of attack; finding a leakage site on a PCB, for example, may require an analyst to cut traces (both on the surface of the PCB and buried within it) in order to limit the number of possible locations for the defect.
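Conceptually, that trace-cutting strategy is a binary search: each cut rules out roughly half of the remaining candidate locations. The sketch below is purely illustrative – the segment list and the leakage "measurement" are hypothetical stand-ins for physical cuts and bench measurements – but it captures the logic an analyst follows.

```python
# Illustrative only: models trace-cut isolation as a binary search.
# "segments" stands in for the ordered pieces of a net between two nodes;
# "leaks_through(kept)" stands in for a physical measurement (e.g. a curve
# trace) made after cutting the net just beyond the last segment kept.

def isolate_leaky_segment(segments, leaks_through):
    """Return the single segment that still shows leakage after
    successive cuts, assuming exactly one defective segment."""
    lo, hi = 0, len(segments) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        # Conceptually: cut the trace after segments[mid] and re-measure.
        if leaks_through(segments[lo:mid + 1]):
            hi = mid          # defect lies in the first half
        else:
            lo = mid + 1      # defect lies in the second half
    return segments[lo]

# Example with a fake measurement: segment "C" is the culprit.
segments = ["A", "B", "C", "D", "E", "F"]
fake_measure = lambda kept: "C" in kept
print(isolate_leaky_segment(segments, fake_measure))  # -> "C"
```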

Once a potential defect site has been isolated, an analyst must be able to reveal the defect in all its glory. While the data gathered from isolation and non-destructive testing may be fairly strong, failure analysis follows the old clichés that “seeing is believing” and “a picture is worth a thousand words”; a failure analysis project is not truly finished until the analyst can produce images clearly showing a defect, removing any shadow of doubt that the anomaly found is at the heart of the reported problem.

This step is almost always destructive; the analyst must, figuratively speaking, tear away the veil of FR4 and copper shielding the defect from view in order to definitively show it. At the assembly level, this often includes cross-sectioning (to show cracked vias and solder joints, or defects between PCB layers) or PCB delayering (to reveal damaged traces and voided or burnt dielectrics). Once the defect has been uncovered, an appropriate imaging solution can be chosen depending on the nature of the defect: high-resolution optical or electron microscopes are sufficient for physical damage and defects, while tools like energy dispersive spectroscopy may be used to provide an “image” of contamination on a device that led to its early failure. With images in hand, an analyst’s work is almost finished.

In the final phase of a failure analysis project, an analyst must report their findings. The tools and techniques used by a failure analyst may not be familiar to their audience, who may be specialists in PCB assembly, metallurgy, or other disciplines. In some cases, the final audience of the report may be predisposed to disbelieve the results of an analysis (for example, when the evidence shows that a subcontractor’s PCBs do not meet required specifications, obligating the subcontractor to re-run one or more lots of product). The failure analysis report must, therefore, be a clear, objective distillation of all data obtained during the course of the analysis, with a strong conclusion grounded in the facts revealed during the process. Whether the results point to a pervasive problem that must be remedied in order to meet reliability targets or simply indicate improper use by an end user, it is important to remember that the purpose of failure analysis is continuous improvement, not finger-pointing. Assigning blame does not offer a solution to a given problem; by understanding the nature of device failures, it is possible to implement corrective action (if necessary) to prevent recurrence of the same defect in future devices.

By following the various steps of the failure analysis process – verification, NDT, isolation, revelation, and reporting – it is possible to take a device that would have been relegated to the trash can and transform it into a vital learning tool. It has been said that failure is the first step on the road to success; understanding why a device has failed is a key starting point for creating a better device. Whether a defect was introduced during PCB manufacturing, during solder reflow, or by an end user, all parties involved may learn from the anomaly and work to improve their own processes. While this article has provided only a generic overview of the failure analysis flow, future articles will dive into further detail, exploring case studies that show the impact failure analysis can have as well as the techniques that go into a successful investigation. Until then, remember the motto of one of the most beloved groups of television scientists around – “Failure is always an option” – and keep an open mind to what that malfunctioning PCA might really be telling you!

Failure Is The First Step on the Road to Success – Part 1

It is an inescapable fact of life that all electronic assemblies – from the most complex, densely interconnected systems to the cheapest mass-produced consumer devices – will eventually fail. Such devices may be victims of various forms of abuse at the hands of their end users, subjected to mechanical, environmental, or electrical stresses far beyond what any design engineer would consider reasonable. Some, especially early prototypes, may be inherently flawed and susceptible to malfunction as a result of a simple mistake made during one too many late-night, bleary-eyed design review sessions conducted over energy drinks and cold takeout. Of course, it is also possible for assemblies to simply die of old age; eventually, normal wear and tear will break down even the most robust of electronic devices. In all these cases, the result is the same (at least at a very high level): a device that no longer performs its intended function.

The process of dissecting the remains of a failed device or assembly, slicing through the tangled web of interconnects and dielectrics to get to the kernel of truth lying at the root cause of the malfunction, is called failure analysis. By utilizing an extensive set of specialized tools and techniques, failure analysts can go from a very basic initial observation (e.g. “no output on J8 pin-4”) to actionable data (“via misalignment between layers 1 and 3 on the trace from U9 pin-6 to J8 pin-4”). Traversing the gap between these two points requires a meticulously detailed approach. In future columns, we will delve into the intricacies of this approach by reviewing individual case studies and techniques, to better show how failure analysis can be applied within an existing manufacturing environment to improve product reliability; in this column, we will provide a high-level overview of what the failure analysis process looks like.

Due to the nature of failure analysis, no two projects will ever be quite the same. Failure modes, environmental conditions, device applications – all these parameters shape the circumstances of a given failure analysis project. Despite the fact that all failures are unique, there is still a generic process that can be applied to drive an investigation to its resolution. This process starts with verification of the problem as reported – whether that report comes from a consumer, an entity further down the supply chain, or even a test engineer directly after production, it is vital to verify that the issue can be recreated before attempting any further analysis. The verification phase of a project may be as simple as a five-minute check with a multimeter, proving that the correct voltage is not present on a given pin or that continuity does not exist between two nodes that should be connected; in other cases, a more complicated approach may be necessary, such as when a failure only appears while the device is operated within a certain temperature range. Verifying the failure is important not only to prove that a problem exists; the verification phase also allows the failure analyst to determine the proper test conditions for later steps of the process.

Verification of the failure is the first of many steps in non-destructive testing (or NDT) of a device. As the name implies, non-destructive tests should have minimal impact on the sample under analysis; ideally, these tests should carry little to no risk of damaging the sample or losing the defect. These tests will generally include a detailed visual inspection, looking for macro-scale defects like cracked solder joints or broken traces. X-ray inspection may reveal features buried within a circuit board or hidden underneath a component, like via misregistration or improperly wetted BGA balls. An acoustic microscope may reveal component-level failures, such as package delamination inside a component that was not properly stored before undergoing reflow. Generally speaking, non-destructive tests are not sufficient to prove the root cause of failure on their own; however, they provide key data that will shape the course of the failure analysis project.

Electronic Device Failure Analysis – Printed Circuit Board Delayering

If one were able to take a modern printed circuit board and examine the vast network of metal traces, completely unobscured by dielectric materials, one would find an intricate, three-dimensional lacework of finely interwoven metal threads. Thin filaments of copper, reminiscent of a spider’s web, snake outward from ring-shaped vias, while in other places metallic tributaries flow into the large bus lines which carry rushing rapids of electrons that provide power to the devices on the board. The many layers of the board taken as a whole bring to mind a futuristic highway system, with thousands upon thousands of individual pathways crossing over one another, routing traffic seamlessly from point to point. Unfortunately, this highway system is not always perfect; thin filaments may break, rushing rapids of electrons may overflow, and improperly built pathways eventually fail, turning these intricate patterns into tangled snarls sure to frustrate any user. In these cases, electronic device failure analysis can help to unravel the tangled web that was woven; one of many approaches that may be taken in these scenarios is printed circuit board delayering.

Printed circuit board delayering is an effective approach to electronic device failure analysis because it is one of the very few ways of examining a board as described above. PCB delayering is a close analogue to integrated circuit delayering: material is removed from the PCB using an abrasive polish, while the analyst maintains device planarity either through the use of specialized lapping equipment or, more commonly, finely calibrated fingers and a trained eye. The fiberglass weave and epoxy fill that make up the PCB dielectrics are generally removed in bulk with a polishing wheel and fine-grit silicon carbide, with final touchup and spot polishing performed by hand. Since metal lines are so tightly packed into small spaces, documentation of the delayered device generally relies on optical microscopy; composite images of the board at relatively high magnification (between 50 and 200 times) are often required to show the level of detail necessary to find a defect. Different optical contrasting methods (e.g. bright-field and dark-field microscopy) are also useful, as some defects appear much more clearly when viewed in a different light.

Since printed circuit board delayering is only one technique for electronic device failure analysis, it is important to recognize the types of defects it is best suited to uncovering. Cracked or otherwise damaged traces are perfect candidates for PCB delayering: the delayering process not only reveals the crack in the trace but also allows complete access to the nodes on both sides of it, allowing an analyst to electrically probe the device and prove that the crack creates an open circuit and is the root cause of failure. Similarly, faults between two different traces on the same layer of a PCB can be identified by delayering and probed in the same fashion. Defects caused by electrical overstress are also easy to find with delayering, as the dielectric material near the failure site will often be discolored (or even blackened and burnt) as a result of the failure, giving the analyst performing the work an easy target. Defects occurring between layers of the board, however, are not good candidates for PCB delayering, since the nature of the technique limits it to in-depth analysis of a single layer at a time; it is, therefore, vital to recognize when PCB delayering is the right choice as opposed to other techniques.

The inherent nature of printed circuit board delayering makes it a tool for inspecting relatively broad areas, as opposed to a precisely targeted cross-sectional analysis. As such, it is often the preferred technique when focused isolation is not possible – open circuits, for example, are often much easier to find with delayering than with cross-section, since identifying the location of an open for cross-section in the absence of expensive time domain reflectometry equipment is difficult at best. Delayering is also indicated for large, distributed defects (e.g. a defect identified through thermal imaging as a large area of generalized heating as opposed to a pinpoint hot spot), since cross-section cannot generally capture such a defect in its entirety.

While printed circuit board delayering is a handy tool for electronic device failure analysis, there are other applications as well. PCB delayering can provide invaluable data when qualifying new processes or suppliers as an avenue for directly measuring process parameters to ensure that specifications have been met; PCB delayering is also useful for reverse engineering endeavors when patent infringement or other intellectual property concerns are suspected. Delayering is, therefore, an excellent addition to any analyst’s repertoire of tricks and techniques.

Failure Analysis of Electronic Assemblies – Investigating Solder Failures

Modern printed circuit assemblies are vastly complex labyrinths of interconnected devices, comprising many hundreds of components and thousands of individual signals routed through networks of metal, silicon, and dielectric material. While the individual integrated circuits on an assembly may steal most of the glory – just look at the buzz surrounding the processors inside the latest and greatest cell phone, video card, or supercomputer – interconnect technology is just as important to the success of a given product. To ensure a robust product, the reliability of the connections between individual components and the PCB that hosts them is paramount; to maximize this reliability, failure analysis of electronic assemblies to investigate solder failures is an excellent springboard to continuous improvement.

While failure analysis of electronic assemblies can take many forms, solder joint studies are incredibly valuable because they can provide a wealth of actionable data about the effectiveness of a given set of process parameters. Joint cracking or non-wetting, “head-in-pillow” defects, device misalignment, and intermetallic compound formation and characteristics are only a sample of the data that can be gleaned from this type of analysis; each of these data points provides invaluable insight into the reflow process, suggesting improvements that may be made (e.g. adjusting reflow profile times and temperatures, changing cleaning processes, or even adjusting the layout of the board to mitigate mechanical stresses). Capturing this information requires the proper application of principles and techniques; fortunately, there are a wide variety of approaches for diving into the intricacies of solder interconnects.

Perhaps the most basic (and one of the few non-destructive) approaches to failure analysis of solder interconnects is x-ray imaging. In only a few minutes of inspection time, an in-depth assessment of solder voiding and device registration can be performed. Given enough resolution, these tools are even capable of detecting gross defects (large joint cracks, missing balls on a BGA, and others) without any destructive analysis. Some of the most cutting-edge tools can even perform tomographic scans and create three-dimensional reconstructions of an area of interest, which can then be reviewed – much as a doctor reviews a patient’s CT scan – to identify potential problems to target and analyze with surgical precision. Though x-ray is quick and non-destructive, it shares another similarity with medical imagery in that a certain degree of interpretation is required to correctly identify defects. As a result, x-ray results are generally confirmed with further testing, usually destructive in nature.

One common destructive test, generally performed on ball grid array (BGA) devices when solder joint failure is suspected, is dye penetrant testing. Dye penetrant testing (colloquially referred to as “dye and pry”) involves immersing a suspect sample in a permanent dye, then subjecting the sample to vacuum to force the dye into all of the sample’s nooks and crannies. Following the vacuum exposure, the dye is allowed to dry; the part is then mechanically pried from the board, breaking all solder connections. Any air gaps that were exposed to the dye (for example, gaps left by cracked joints, head-in-pillow failures, or other non-wetting) will be readily apparent, as they will be visibly stained; good solder joints, conversely, will show no dye. Dye penetrant testing allows an analyst to map out all solder joints on a given device, revealing trends that often translate into specific failure mechanisms (for example, large groups of cracked joints at the corners of a device may indicate excessive mechanical stress). While dye penetrant testing allows a quick survey of all the joints on a given device, the depth of information gleaned is often limited to the location of each failing joint and where within the joint it failed (i.e. whether the joint cracked at an interface between solder and PCB, in the middle of the joint, etc.); for greater depth, other techniques are necessary.
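To illustrate what “mapping out” the joints can look like, here is a purely hypothetical sketch: the grid of dye-and-pry results is reduced to a single number indicating how strongly the stained joints cluster near the package corners. The data structure and corner-margin logic are invented for illustration only and do not describe any standard reporting format.

```python
# Hypothetical illustration of summarizing a dye-and-pry map for a BGA.
# "results" maps (row, col) ball positions to True where dye was found
# (i.e. the joint was cracked or non-wetted before the pry).

def corner_crack_fraction(results, rows, cols, margin=2):
    """Fraction of dyed joints that sit within `margin` balls of a corner."""
    def near_corner(r, c):
        return (r < margin or r >= rows - margin) and \
               (c < margin or c >= cols - margin)
    dyed = [(r, c) for (r, c), stained in results.items() if stained]
    if not dyed:
        return 0.0
    return sum(near_corner(r, c) for r, c in dyed) / len(dyed)

# Toy example: a 10x10 BGA with dye found only on three corner balls.
toy = {(r, c): False for r in range(10) for c in range(10)}
toy[(0, 0)] = toy[(0, 1)] = toy[(9, 9)] = True
print(corner_crack_fraction(toy, rows=10, cols=10))  # -> 1.0: corner clustering, hinting at mechanical stress
```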

Cross-sectional analysis is, in some senses, the complement to dye penetrant testing. While the scope of the analysis is much more limited, with only a few joints visible in any given section, a much greater depth of detail can be gleaned. After a sample is ground down and polished to reveal the joint in question, high-resolution microscopy can be used to characterize the joint and any failures therein. Thicknesses of intermetallic compounds can be measured, solder phases can be identified, and defects like cracked or non-wetted joints can be photographed directly in situ rather than inferred after the fact, as with dye penetrant testing. Elemental analysis can even be used to study the admixtures of the various metals present. As mentioned, however, the cross-section is a much more targeted analysis than dye penetrant testing; an analyst must already know where the failure is before beginning the analysis.

Given the wide breadth of approaches to performing failure analysis of electronic assemblies where solder joint problems are suspected, choosing the proper approach is crucial in gathering actionable data. With proper planning and a methodical mindset, however, a successful investigation can generate a wealth of invaluable knowledge.

Electronics Failure Analysis Services – FIB Isolation and Editing

The focused ion beam (FIB) is a powerful tool in the hands of a skilled electronics failure analysis engineer. In this post, we use the metaphor of a surgeon wielding a scalpel to help explain the power and versatility of the FIB system.

Imagine, if you will, a futuristic surgery theater. A surgeon sits before a sprawling bank of monitors showing images of a patient, magnified tens or hundreds of thousands of times larger than they would appear to her naked eye.

She rests her fingers delicately on a panel of knobs, sliders, and joysticks, carefully adjusting each input to calibrate her instruments. A few moments of fine-tuning and the target of her procedure crystallizes into focus: a microscopic defect that, despite its size, seriously threatens the patient’s life.

She quickly drafts a surgical plan on the image; instead of forceps or a scalpel, she brings a tightly focused particle beam to bear, making infinitesimally small, precisely placed incisions isolating the anomalous target from its surroundings. A quick twiddle of the controls and the energy beam becomes a tool for reconstruction instead of excision. The surgeon deftly reroutes the affected parts of the patient’s anatomy to circumvent the faulty material, ensuring a full recovery.

What may seem to be a scene ripped from a schlocky science fiction movie may ring truer than one might expect, given one key disclosure: our patient is not a living being, but rather an integrated circuit, wrought in silicon and metal. Our surgeon’s tool was a focused ion beam (FIB) system; the procedure she executed was not a heart bypass or tumor biopsy, but an example of FIB isolation and editing, one of many electronics failure analysis services offered by Spirit.

The focused ion beam is an extremely versatile tool, one which greatly broadens the variety of electronics failure analysis services that can be performed.

The FIB is very similar to an electron microscope in that it uses a beam of charged particles, rather than focused light, to create images of nanoscopically small objects. The FIB differs, however, in that its beam can also be used to remove integrated circuit (IC) material or to catalyze chemical reactions (for example, the dissociation of metal atoms from a carrier molecule and the subsequent adsorption of those atoms onto the surface of a microchip).

These properties allow the FIB to be used to cut and rewire an integrated circuit, just as our hypothetical surgeon did. Such operations will generally take place during the early debug phases of a product, to identify and eliminate errors in a design. The FIB may also be used during failure analysis to confirm the root cause of failure. For example, if an open circuit is identified as the root cause of a given malfunction, an analyst may reconnect the circuit at the site of the open and retest the sample, showing that once the open is remedied the part begins functioning normally.

The FIB is also a powerful tool for IC and PCB failure analysis in that it can be used to great effect in isolating a failure. The nature of the ion beam is such that charge contrast effects are generally far stronger than those seen in the electron microscope, which is a boon when searching for open circuits.

The FIB’s ability to cut traces is also invaluable for isolating potentially failing nodes on a die, with FIB probe pads created with the tool’s metal patterning capability providing an easy way to test the newly isolated features.

Sample preparation for inspection in other tools is another of the many failure analysis services that can be performed with the FIB. Rather than performing mechanical cross-sections, which can be time-consuming and risky when dealing with small defects, the FIB can be used to perform precisely targeted sections for analysis.

In many cases, FIB cross-sectioning can successfully prepare samples from materials that would be exceedingly difficult to section with a traditional mechanical cross-section. For example, certain types of soft or flexible materials used in modern PCBs can prove problematic for traditional sectioning but are easily prepped in the FIB.

Creating thin lamellae for transmission electron microscopy is also greatly simplified by the use of the FIB (though it is still by no means trivial, requiring several hours of time for a single sample). Judging when the sample has reached the proper thickness for TEM work is much more direct in the FIB than with some mechanical preparation techniques.

Though the FIB’s capabilities (and perhaps even its name) sound as though they have been lifted from a failed script for a futuristic TV show, the focused ion beam allows electronics failure analysis labs like Spirit to greatly expand the services they can offer. The FIB’s value in performing isolation, editing, and sample preparation makes it an invaluable tool, facilitating failure analysis of even the most cutting-edge technologies.

Analyzing Semiconductor Failures – From Evidence to Root Cause

The culminating moment of triumph for any failure analysis project is when a defect is captured in all its glory – that instant when the noisy tangle of data and observations crystallizes into a coherent analysis with the addition of one crowning piece of evidence. While it would seem that the final photograph, showcasing the defect that lies at the root of a failure, would draw a failure analysis project to a close, there is often still work left to do; in many cases, analyzing semiconductor failures requires an even deeper examination of the defect to determine its most likely origin.

When an analyst has finally uncovered the defect, there are still further questions to answer. Was the defect caused by an outside stimulus (e.g. mechanical or electrical overstress), or was it a pre-existing problem introduced during the manufacturing process? Was the failing device simply an unfortunate victim of statistics, succumbing to the ill fortune of random process anomalies, or could it be indicative of a more pervasive issue? Without properly identifying the source of the defect, determining what corrective actions (if any) are necessary to prevent recurrence is nearly impossible. As such, the analyst must not take the defect at face value, but must consider many other data points to determine the root cause of failure.

One of the first things to consider when analyzing a semiconductor failure is the history of the device. In some cases, a device’s history will immediately identify the root cause of failure – for example, the root cause of a device failing after being subjected to ESD testing is, shockingly, almost always ESD. Failure analysis in these cases is often performed to determine which structures on the device were affected, so that improvements to the design can be made if necessary. In other cases, the history of a device may be used to determine which failure mechanisms can safely be excluded (or at least de-emphasized). For example, if a device has been operating in the field for several years before its failure, chances are very good that a processing defect was not the root cause of its untimely demise. Consider a plot showing the probability of a device failing with respect to its operating time; generally speaking, this plot will follow a “bathtub curve,” with higher failure rates at the very beginning and very end of a device’s operating lifespan and a relatively low, constant failure rate between the two extremes. Early-life failures are generally the result of processing issues; conversely, end-of-life failures often result from the wear and tear a device has been subjected to throughout the course of its life. If a device survives in the field for several years, it is extremely unlikely that a process defect was responsible for its failure – most such devices would have failed far sooner. Graphically speaking, process defects tend to fall at the left end of the bathtub curve.
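For readers who prefer equations to plots, the bathtub curve is commonly modeled (in standard reliability-engineering terms, not anything specific to the cases above) as the sum of a decreasing infant-mortality hazard, a constant random-failure rate, and an increasing wear-out hazard, for example using Weibull terms:

\[
\lambda(t) \;=\; \frac{\beta_1}{\eta_1}\!\left(\frac{t}{\eta_1}\right)^{\beta_1 - 1} \;+\; \lambda_0 \;+\; \frac{\beta_2}{\eta_2}\!\left(\frac{t}{\eta_2}\right)^{\beta_2 - 1}, \qquad \beta_1 < 1,\ \beta_2 > 1.
\]

With \(\beta_1 < 1\) the first term decays rapidly, which is the quantitative version of the argument above: a process-related (infant-mortality) failure becomes increasingly improbable the longer a device has already survived in the field.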

The history of a device is not the only thing that must be taken into account when analyzing a semiconductor failure; the location of a defect on the device can also offer clues about the nature of the device’s failure. Damage consistent with electrical overstress can be interpreted much differently depending on where it falls on the die; blown ESD protection diodes may imply typical electrical overstress (e.g. a high voltage transient on an input pin) while damage in the device core, with no noticeable effect on the protection diodes, may be indicative of an inherent weakness in a device’s process. Defects occurring in high-field areas, like at the edges of diffusions or between nodes with high potential differences, are interpreted differently than those in areas without the added stress of a high e-field (a defect within a metal trace, for example).

While many defects require in-depth analysis to determine their root cause, other defects may speak for themselves. Excessive metal causing a short at the lower layers of a die can only be a processing error; a charred and blackened logic IC with a hole blown clear through the plastic encapsulant, on the other hand, is very unlikely to result from a processing defect, unless that processing defect was the inclusion of a low-order explosive instead of a silicon circuit – an improbable occurrence, indeed. While these examples are obviously extreme, there are many types of defects – fused bond wires, scratches on the die, and so on – that can be immediately correlated with a failure mechanism.

While it’s certainly true that the moment when an analyst can capture a defect with the perfect image may be the most exciting in a failure analysis project, the effort does not end there; in order to properly identify any corrective actions that must be taken, the defect must be correlated to its root cause. The value of a good failure analysis lab is the experience that enables accurate correlation of defects and causes – the ability to synthesize all the evidence gathered into a coherent theory about the life and death of a device.

Microelectronics Failure Analysis – Our Toolbox

At Spirit, we constantly strive to provide our customers with accurate, reliable data. We realize that our contribution to a given project may have far-reaching ramifications that continue long after we’ve sent our reports and finished our analyses. Because the microelectronics failure analysis services we provide can be so vital to our customers’ processes of continuous improvement, it is important to us to ensure that our tools are up to the task of ferreting out the root cause of failure in even the most complex of devices. Many who are unfamiliar with FA are unaware of the types of tools that might be in an analyst’s repertoire; what follows is a brief overview of an analyst’s toolbox, all of which can be applied to increase understanding of a failure.

In microelectronics failure analysis, as in many things in life, seeing is believing; it should come as no surprise, then, that optical microscopy forms the backbone of many FA efforts. Careful inspection and documentation of a suspect device is critical for successful FA, and magnified optics allow analysts to identify defects that may be invisible to the naked eye. While traditional bright-field illumination is still the cornerstone of any optical analysis, many analysts will turn to enhanced-contrast imaging modes, such as differential interference contrast (DIC) or polarized light, in order to reveal even further nuances of detail (for example, DIC imaging can often be used to identify damaged dielectrics on a semiconductor die). Some microscopes, like the ones used at Spirit, even offer a certain degree of automation in the photographic process, allowing analysts to acquire images with a high level of detail across an area much larger than traditionally possible with optical microscopy. Though the optical microscope is a cornerstone of any FA effort, it does have its limitations, chief amongst which is the inescapable physical limitation of diffraction: features smaller than roughly half the wavelength of visible light cannot be resolved.
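That diffraction limit can be made concrete with the Abbe relation (standard optics, not specific to any particular instrument):

\[
d \;\approx\; \frac{\lambda}{2\,\mathrm{NA}}
\]

where \(\lambda\) is the illumination wavelength and \(\mathrm{NA}\) is the numerical aperture of the objective. With green light (\(\lambda \approx 550\) nm) and a good dry objective (\(\mathrm{NA} \approx 0.9\)), \(d\) is roughly 300 nm; even an oil-immersion objective (\(\mathrm{NA} \approx 1.4\)) only brings the limit down to roughly 200 nm – still enormous compared to the features discussed next.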

Given that many features on a semiconductor die are as much as an order of magnitude smaller than the wavelengths of light, it is necessary to use other tools to provide images of a device for failure analysis. Since the resolution limits of an optical microscope are imposed by the physical properties of light, the solution for sharper imaging was straightforward: use microscopes that do not use light to produce an image. The most obvious example of such a tool is the electron microscope, both in its transmission and scanning variants, which offer sub-nanometer resolution for identifying even the smallest of defects (for example, a gate oxide pinhole). Another example of an alternative imaging system is the atomic force microscope (AFM), which uses the forces exerted on a scanning probe by the surface of a sample to create a topographic image of a device (excellent for creating roughness profiles). Of course, once we are freed from the use of visible light to view a part, other types of imaging are possible that can reveal entirely different information about a part: x-ray and acoustic imaging both provide data about what is happening inside a packaged device, without disturbing any of the contents.

While imagery is indeed one of the key supports of a microelectronics failure analysis project, it is but one leg of the proverbial milking stool. Equally important are the tools used to confirm and isolate a given failure. Many of the tools used to confirm a failure would be immediately familiar to anyone with an electronics background – power supplies, oscilloscopes, logic analyzers, and the like. Isolation tools, on the other hand, are somewhat more specialized and are not often seen outside of an FA lab. Tools like photoemission microscopy or thermally induced voltage alteration (TIVA) systems are used to identify semiconductor defects that may be invisible (at least initially) to most forms of inspection, while time domain reflectometry (TDR) or thermal imaging may be used to identify defects in a printed circuit assembly. Without first confirming a failure and isolating it to a given location, an analyst has no way to identify a site for optical inspection – or, even if a site has been identified, the analyst may not be able to positively correlate a given defect with the reported failure.
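Time domain reflectometry is a good example of how simple the underlying isolation principle can be: the instrument launches a fast edge down a trace and times the reflection returning from the fault. Assuming the propagation velocity of the trace is known, the distance to the defect is approximately

\[
d \;=\; \frac{v_p\,\Delta t}{2}, \qquad v_p \approx \frac{c}{\sqrt{\varepsilon_{\mathrm{eff}}}},
\]

where \(\Delta t\) is the round-trip delay of the reflection and \(\varepsilon_{\mathrm{eff}}\) is the effective dielectric constant of the trace. For a typical FR4 board (\(\varepsilon_{\mathrm{eff}}\) around 3 to 4), \(v_p\) is roughly half the speed of light, so a 1 ns round trip places the fault on the order of 7 to 8 cm down the trace.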

To continue the analogy of the milking stool, the third leg supporting a microelectronics failure analysis project is the set of tools used for sample preparation. These tools may be chemical products used for stripping away plastic, metal, and oxide; they may be mechanical in nature, like precision CNC milling machines or ultra-fine polishing solutions; they may even sound more at home on the set of a science-fiction TV series than in a microelectronics lab, like plasma etchers that turn organic compounds into ash or focused ion beams that perform nanoscopic surgery on integrated circuits. These sample preparation instruments are a vital component in producing the data needed from an FA project; without proper sample prep, the results of an analysis are questionable at best (e.g., “Is that a defect, or a dust speck?”).

Though all the tools in the FA toolbox are important, the most vital of all is an experienced analyst who knows how to properly apply them; without the proper background, all the tools in the world will not produce an accurate, reliable analysis.

Uses for Auger Electron Spectroscopy

Modern consumer electronics devices must withstand all manner of harsh environments. They may operate in areas where humidity is extremely high, providing ample ambient moisture that can be detrimental to the operation of sensitive circuits. Many dirty environments are filled with dust, grime, and a whole laundry list of other contaminants, ranging from the innocuous to the truly disgusting, that can be pulled in by a device’s cooling fans, introducing myriad organic and inorganic contaminants that may collect on the surface of a device. Still other factors may exist that many designers would never even consider as a possible source of contamination; in one case, Spirit opened a device that had been returned from the field, only to find the inside thoroughly coated with the remains of unfortunate insects that had attempted a too-thorough inspection of the system’s fan. All of these things may contribute to the malfunction of an electronic device; however, it is up to the analyst to determine whether these contaminants or other environmental factors truly lie at the root of the failure, or are merely incidental. Could ionic contamination, introduced from the environment, be causing a short circuit? Are the failing solder joints on a device the result of residual material left behind during board manufacturing? Fortunately, analysts have at their disposal tools that can help them understand the chemistry of failure; Auger spectroscopy is one such tool.

Auger spectroscopy is an elemental analysis tool, similar to x-ray fluorescence (XRF) or energy dispersive spectroscopy (EDS). A device is bombarded with a high-energy electron beam, which knocks core electrons out of the atoms in the material; when an outer-shell electron drops down to fill the resulting vacancy, the energy released can eject a second electron from the atom. This ejected electron, called an Auger electron, has a characteristic kinetic energy determined by the element of the atom it was ejected from. By using a specialized detector, it is possible to measure these energies, which can be directly correlated to the element of the atom from which the electron originated; by analyzing a statistically large sample of these electrons, it is possible to generate an energy spectrum showing the elemental makeup of a sample. Unlike XRF and EDS (both of which are based on the analysis of x-rays), Auger spectroscopy returns data strictly about the surface of a sample, owing largely to the nature of electrons and their inability to penetrate large thicknesses of material (relative to x-rays); depending on the type of analysis, this can be a very useful characteristic.
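To make the energy-to-element relationship concrete, the textbook approximation for a \(KL_1L_{2,3}\) Auger transition gives the emitted electron a kinetic energy of roughly

\[
E_{KL_1L_{2,3}} \;\approx\; E_K - E_{L_1} - E_{L_{2,3}} - \phi,
\]

where the \(E\) terms are the binding energies of the core levels involved and \(\phi\) is the work function of the spectrometer. Because those binding energies are unique to each element, the positions of the peaks in the measured spectrum map directly onto elemental identity.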

One of many applications for Auger spectroscopy in failure analysis is looking for contaminants on a printed circuit board (PCB). Ionic contamination is one of the more prevalent causes of early-life failure of PCBs, corroding metal, degrading solder adhesion, and creating conductive pathways where none previously existed; however, many of the elements that constitute ionic contaminants also occur normally in the construction of the PCB, in the form of fire retardant, epoxy, and so on. Since these elements may be naturally present, it can be difficult to distinguish contaminants from the sample bulk using techniques like EDS, which have a relatively large spot size and sampling depth; with Auger spectroscopy, however, an analyst can be relatively sure that the data gathered come from the targeted area and not its surroundings. Similarly, the detection limit of Auger spectroscopy is lower than that of EDS, allowing smaller traces of contaminant to be identified.

Though its high degree of spatial resolution and sensitivity make Auger spectroscopy a perfect tool for analyzing contaminants, it can easily be used for many other purposes as well. Clients interested in knowing the exact composition of the various layers of a product – for example, as part of an intellectual property investigation – can benefit from the precision offered by Auger spectroscopy. Indeed, when considering the modern integrated circuit, composed of multiple layers often no thicker than a few hundred nanometers, Auger spectroscopy is one of the few tools with enough precision to accurately identify the elemental makeup of the features that make up the circuit.

Auger spectroscopy’s adaptability makes it applicable to many different situations: whether characterizing contaminants or analyzing the constituent components of a device, the data provided by Auger analysis can be invaluable. The technique’s versatility makes it a worthwhile tool for any analyst; given the unpredictable nature of failure analysis, analysts always need a certain degree of flexibility and adaptability in their tools. It is important to remember, however, that data must be placed within proper context in order to be useful; the role of the failure analyst is to properly interpret the raw data, in order to truly pinpoint the root cause of failure for a given device.

Tips for Management of a Failure Analysis Project

Part of the inherent nature of failure analysis is the fact that no two jobs will ever be quite the same. Failure modes, environmental conditions, device applications – all these parameters shape the circumstances of a given failure analysis project. Managing a failure analysis project therefore requires particular care and attention, to ensure that the proper tools and techniques are chosen for a given job. Charting the course of a failure analysis project requires not only a solid grounding in the tests and equipment used in the lab, but also on-the-fly synthesis of disparate data points – not just the incoming data generated by the failure analysts, but also information about how and under what conditions a device was used before its failure.

One of the most important tips for a failure analysis manager is to secure a solid, open line of communication with the customer. While this may seem obvious, since F/A is ultimately a service-related business, the reasons for keeping in touch with the customer are manifold. The customer will almost always understand (or can get in contact with someone who understands) more about the application and history of a given device than a failure analyst, who in many cases sees the device for the first time after it has been turned into a twisted, charred lump of electronic creosote. By understanding the history of a device – for example, the manufacturing processes that a group of production rejects went through right before their failure, or the in-circuit application of a particular microchip that burned itself to a crisp – the failure analysis manager can make choices about how to direct the analysis based on the most likely mechanisms of failure. Communication with the customer is also important as part of the education process – it is highly likely that the failure analysis manager will be far better versed in the analytical tools and techniques used to find the root cause of failure than the customer will be. By keeping communication open with the customer, a failure analysis manager can suggest the best course of action, potentially saving the customer money that would otherwise have gone to unnecessary testing.

Another important tip for failure analysis managers is to approach each analysis with an open mind and avoid pigeonholing a given project due to preconceived notions. One of the worst possible situations to find oneself in as an F/A manager is performing “failure analysis by numbers” – following a laundry list of steps and procedures simply for the purpose of checking them off the list, without any regard for whether the tests are producing valuable data. A failure analysis project is very much a closed feedback loop; the results of one test are invaluable in determining the next test or procedure in line. It therefore behooves the failure analysis manager to be involved in all stages of the project, reviewing data at each step in order to better understand the failure and correctly choose the next test to perform.

Given the ever-changing nature of failure analysis projects, one of the most beneficial traits for failure analysis managers to cultivate is a willingness to improvise. The tools and techniques in an analyst’s toolbox are often sufficient – however, many jobs require test configurations above and beyond what might be considered typical. An analyst may need to draft a schematic and assemble a breadboard to perform rudimentary functional analysis of a complex device, or rig up a fixture for simultaneously providing mechanical stress and electrical analysis. Naturally, high value is placed upon ingenuity – a manager who can think on his or her feet to create a solution to a unique failure analysis problem using the resources at hand is an asset to any F/A service provider.

To sum up all the previous pointers, the best failure analysis managers are those who are willing to learn and adapt. Like any scientific endeavor, failure analysis is ultimately the pursuit of further knowledge – not only the knowledge of the reasons and mechanisms for device failure, but knowledge of techniques, test setups, and the state of the microelectronics industry in general. This thirst for knowledge is the defining trait of any failure analyst, and forms the basis upon which the F/A process is built.

Integrated Circuit (IC) Failure Analysis – Cutting Edge Trends

With the release of smaller, more feature-laden devices every year, it is obvious that the electronics industry is in a constant state of flux and evolution. The increase in complexity of a single integrated circuit over the years is undeniable, whether it is due to paradigm shifts in the methods of construction and operation or simply a result of the inexorable march of Moore’s law, which predicts that the number of transistors on integrated circuits will double roughly every two years.

Naturally, this constant change in technology has serious ramifications for failure analysis; a technique that was suitable for older products may not be sufficient for submicron technologies, with their densely-packed features and towering metal stacks. The failure analysis industry has therefore needed to respond quickly to changes in technology and develop new techniques capable of handling even the most complex of devices.

One of the most obvious effects the evolution of the integrated circuit has had on the electronic failure analysis process is its impact on inspection techniques. In many cases, a good optical microscope used to be sufficient for a large portion of an inspection – in some cases, even for a detailed circuit extraction, a time-consuming process involving hours at the microscope tracing out individual signals to get a complete picture of the construction and function of a device.

As device features continue to shrink, however, optical microscopy is no longer sufficient; diffraction imposes an absolute limit on the resolution of optical microscopy at roughly one-half the wavelength of visible light. In practical terms, this means that features smaller than roughly 200 nanometers cannot be correctly resolved with optical microscopy.

Electron microscopy is an obvious alternative for failure analysis, since the resolution limit of most modern tools is measured in tenths of nanometers. However, using electron microscopy for large-scale inspections, like those necessary for a circuit extraction, can be incredibly time- and labor-intensive, requiring a highly skilled operator and several hours to take all the images necessary for a successful inspection. As requests for this type of in-depth analysis have become an increasing trend in the industry, it has been necessary to find a way to meet the rising demand. Spirit’s solution is to automate as much of the process as possible: using software developed by the manufacturers of our optical and electron microscopes, we can define an area of interest, set key parameters for the microscope and image, then walk away, allowing the microscope to autonomously bear the burden of collecting the library of photographs necessary to complete the IC failure analysis service.
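At its core, this kind of automation is just tiling arithmetic: cover the area of interest with an overlapping grid of fields of view and step the stage through each position. The actual acquisition software is vendor-specific, so the sketch below is only a generic illustration of that arithmetic (the function name, overlap value, and field-of-view numbers are all invented for the example):

```python
# Illustrative sketch of the tiling arithmetic behind automated composite
# imaging: cover a rectangular area of interest with overlapping fields of
# view and emit the list of stage positions to visit.

import math

def tile_positions(area_w_um, area_h_um, fov_um, overlap=0.15):
    """Yield (x, y) centers, in micrometers, of overlapping image tiles."""
    step = fov_um * (1.0 - overlap)                 # stage step between tiles
    nx = max(1, math.ceil((area_w_um - fov_um) / step) + 1)
    ny = max(1, math.ceil((area_h_um - fov_um) / step) + 1)
    for j in range(ny):
        for i in range(nx):
            yield (fov_um / 2 + i * step, fov_um / 2 + j * step)

# Example: a 2 mm x 1 mm region imaged with a 250 um field of view.
tiles = list(tile_positions(2000, 1000, 250))
print(len(tiles), "images required")   # -> 10 x 5 = 50 images
```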

The benefits of this type of system are immediately apparent: customers get datasets with far more detail than was previously possible, while the impact on Spirit’s workflow is much smaller than if a dedicated operator had to man the tool for the whole inspection, allowing greater throughput (which in turn leads to happier customers, especially in an age where instantaneous information has become the expectation rather than the exception).

Another area in which the increased complexity of the integrated circuit has caused issues for failure analysis is cross-sectional analysis. There are many trusted techniques for performing cross-sections that worked admirably for older devices: cleaved cross-sections, in which a device is scribed and broken at a site of interest, and mechanically ground cross-sections, in which abrasives are used to remove material from the die until the site of interest is reached, have both been used successfully to reveal defects for imaging. When the margin of error is expressed in nanometers rather than microns, however, these techniques begin to look almost barbaric – when the size of the defect is expected to be smaller than the finest polishing abrasive used in the cross-sectioning process, it is probably time to consider another approach. The problem of targeting is compounded by the optical resolution challenges discussed above; often, it is necessary to target a single transistor, memory cell, or other similarly minuscule feature in order to uncover a defect.

The solution to this impasse can be found in the focused ion beam (FIB), a tool similar to an electron microscope that is capable not only of imaging a device but also of removing material. Focused ion beams often have built-in navigation systems that simplify the process of finding a cross-section site; an analyst need only load in the layout of the device (hopefully provided by an amicable, understanding customer) and perform a quick alignment, after which it is possible to navigate using the CAD model of the device. Cross-sectioning is as simple as defining the area of interest and creating an etch profile, then allowing the focused ion beam to slowly remove material until the defect has been uncovered. In response to the needs of the market, Spirit has placed an order for a dual-beam FIB (a FIB that also incorporates a high-resolution electron column for better imaging), which we will take delivery of at the beginning of 2013. We predict that this added capability will greatly increase our value to our customers and continue our reputation for fast, accurate analyses.
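The “quick alignment” mentioned above is, at heart, nothing more than fitting a coordinate transform through a few fiducial points so that any location in the CAD layout can be converted into a stage position. A minimal sketch of that arithmetic is shown below (it uses numpy and invented coordinates, and is independent of any particular FIB vendor’s navigation software):

```python
# Minimal sketch: fit an affine transform (rotation/scale/translation) that
# maps CAD layout coordinates to stage coordinates from three fiducial pairs,
# then use it to navigate to an arbitrary feature in the layout.

import numpy as np

def fit_affine(cad_pts, stage_pts):
    """Solve stage = [x, y, 1] @ coeffs from >= 3 matched (x, y) point pairs."""
    cad = np.asarray(cad_pts, dtype=float)
    stage = np.asarray(stage_pts, dtype=float)
    M = np.hstack([cad, np.ones((len(cad), 1))])    # one [x, y, 1] row per fiducial
    coeffs, *_ = np.linalg.lstsq(M, stage, rcond=None)
    return coeffs                                   # 3x2 matrix of transform terms

def cad_to_stage(coeffs, point):
    x, y = point
    return np.array([x, y, 1.0]) @ coeffs

# Toy example: stage frame is rotated 90 degrees and offset from the layout.
cad_fids = [(0, 0), (1000, 0), (0, 1000)]
stage_fids = [(500, 500), (500, 1500), (-500, 500)]
T = fit_affine(cad_fids, stage_fids)
print(cad_to_stage(T, (250, 250)))                  # -> approx. [250., 750.]
```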

In essence, the overarching trend in integrated circuit (IC) failure analysis is one of increasing precision – more narrowly targeted isolation techniques, higher imaging resolution, and more accurately directed destructive methods are becoming necessary to work with modern devices. As a result, it is important to choose a lab committed to keeping up with this cutting-edge technology, in order to help ensure the best outcome for a given analysis.