Noise related to PCB in WD HDDs (Part II)

2- The Continuous Noise

Sometimes a continuous noise comes from WD HDDs, mainly those with L-shaped PCBs that use the motor ICs Smooth 1.3, L6278 1.7, or L6278 1.2.

The noise is like: Trrrrrrrrrrrrrr or Trrrr….Trrrr…Trrrrr

To fix this problem, do the following:

1) – Carefully clean the connection points that connect the head-stack pins to the PCB, using a pencil eraser.

2) – Clean the motor IC pins thoroughly with a solvent and a toothbrush, then wipe them with a piece of soft cloth to remove the dust and dirt.
Note: the two steps above solve the problem in only a few cases.

3) – If the two steps above did not fix the problem, replace the motor IC, because it is damaged.
Note: in the case of the L6278 1.7 and L6278 1.2 motor ICs, first try desoldering and then resoldering them before deciding to replace them with new ones. This sometimes works; if it does not, replace them directly.
– In the case of the Smooth 1.3 motor IC, you must replace it directly.

The image below shows where to clean:

[Image: WD L-shape PCB with the head-stack contact points and motor IC pins marked]


Noise related to PCB in WD HDDs (Part I)

This article discusses the causes of, and solutions for, the two main types of noise that occur in WD HDDs, especially those with L-shaped PCBs.

Clicking Noise and Continuous Noise:

1- The Clicking Noise
When you power on the hard drive, you will hear a noise like click, click… click, click… click, click.
This noise may be related to the head stack or to the PCB. The first thing to do is check the PCB, using the following steps:

1)- First, clean the whole PCB with a solvent and a toothbrush, then wipe it with a piece of soft cloth to remove the dust and dirt.
Caution: cleaning the PCB must be done carefully to avoid knocking off any small electronic components.

2)- Check resistor R120; its correct value is 0.12 ohm. Set your multimeter to resistance-measuring mode to determine its value, and replace it if it is damaged. Before that, however, you have to check transistor Q3. It is a 6-pin transistor; to measure it, set your multimeter to diode mode. The correct readings are: first pin pair = 0.000, second pin pair = 0.000, third pin pair = just over 600.
If Q3 is damaged, it will burn the new R120 as soon as you replace it, so make sure Q3 is OK before replacing R120. You may also check transistor Q6 by the same method to be completely sure that it is safe to replace R120.
Note: to be sure of the correct values of these electronic components, compare the values you measure with those of the same components on a working PCB; a small checklist sketch of this comparison follows the list below.

3)- Check the coils (such as L2 and L7). With your multimeter in diode mode, the correct reading for any coil is 0.000, since a healthy coil is effectively a short circuit at DC.

4) – Inspect the whole PCB for any missing components (such as small capacitors or resistors); these can be knocked off during forceful cleaning of the PCB, so be careful while cleaning it.

5) – In rare cases, the firmware microchip may be damaged.
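
To make the note in step 2 concrete, here is a minimal sketch, in Python, of the compare-against-a-known-good-PCB idea. Only the nominal readings quoted above (0.12 ohm for R120, 0.000 for the first two Q3 pin pairs and the coils, just over 600 for the third pair) come from this article; the tolerance, the suspect readings, and all names are illustrative assumptions.

```python
# A minimal sketch (values illustrative) of comparing measured component
# readings on a suspect PCB against reference readings from a known-good one.

# Reference readings from a known-good board: ohms for R120, diode-mode
# readings for the Q3 pin pairs, 0.000 for healthy coils.
REFERENCE = {
    "R120": 0.12,       # current-sense resistor, ohms
    "Q3_pair1": 0.000,  # diode mode, first pin pair
    "Q3_pair2": 0.000,  # diode mode, second pin pair
    "Q3_pair3": 0.600,  # diode mode, "just over 600" on a millivolt scale
    "L2": 0.000,        # coil: effectively a short
    "L7": 0.000,
}

TOLERANCE = 0.05  # 5% relative tolerance for non-zero readings (assumed)

def check_component(name: str, measured: float) -> bool:
    """Return True if a measured value matches the known-good reference."""
    expected = REFERENCE[name]
    if expected == 0.0:
        return abs(measured) <= 0.01  # near-short expected
    return abs(measured - expected) / expected <= TOLERANCE

# Example readings from a suspect board; Q3's third pair reads far too high,
# which would point at a damaged Q3 (fix it before fitting a new R120).
suspect = {"R120": 0.12, "Q3_pair1": 0.0, "Q3_pair2": 0.0,
           "Q3_pair3": 1.4, "L2": 0.0, "L7": 0.0}

for component, value in suspect.items():
    status = "OK" if check_component(component, value) else "SUSPECT"
    print(f"{component}: measured {value} -> {status}")
```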


Tape services – Beyond Just Data Recovery

What happens when you need to access files from an old backup tape that is no longer compatible with your backup system, tape drive, or backup software?

The rapidly changing world of IT means that new innovations are constantly replacing the latest technology. As backup regimes change, old tapes become obsolete even though requests to restore old files keep coming. Furthermore, data compliance regulations require businesses to retain data for many years, often longer than the availability of the technology used to store it.

Causes of tape failure and data loss
•    Corruption – operational error, mishandling of the tape or accidental overwrites caused by inserting or partially formatting the wrong tape.
•    Physical damage – broken tapes, dirty drives, expired tapes and damage caused by fire, flood or other natural disaster
•    Software upgrades – data on tape cannot be read by new applications or servers

Tape recovery process
•    Tape recoveries are performed in dust-free cleanroom environments
•    Tapes and tape drives are carefully dismounted, examined and processed
•    Proprietary tools can “force” the drive to read around the bad area to recover your data successfully (a simplified sketch of this skip-and-retry idea follows below)
•    Drives are imaged, and a copy of the disk is created and transferred to a new system
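
As a rough illustration of that skip-and-retry idea, the sketch below images a source block by block, retries failed reads, and zero-fills blocks that stay unreadable so the rest of the data survives. The block size, retry count, and file paths are assumptions for illustration; real recovery tools (ddrescue, for example) are far more sophisticated.

```python
# A simplified sketch of imaging a failing source while "reading around"
# bad areas: read block by block, retry failures, zero-fill the rest.
import os

BLOCK = 64 * 1024  # 64 KiB read granularity (assumed)

def image_with_skips(src_path: str, dst_path: str, retries: int = 3) -> list:
    """Copy src to dst, zero-filling blocks that cannot be read.

    Returns the byte offsets of the unreadable blocks so they can be
    retried later or reported as unrecoverable.
    """
    bad_offsets = []
    size = os.path.getsize(src_path)  # for a raw device, query its size instead
    with open(src_path, "rb", buffering=0) as src, open(dst_path, "wb") as dst:
        offset = 0
        while offset < size:
            length = min(BLOCK, size - offset)
            data = None
            for _ in range(retries):
                try:
                    src.seek(offset)
                    data = src.read(length)
                    break
                except OSError:  # media error on this block; try again
                    continue
            if data is None:     # every retry failed: skip and zero-fill
                bad_offsets.append(offset)
                data = b"\x00" * length
            dst.write(data)
            offset += length
    return bad_offsets

# Usage (illustrative paths):
# bad = image_with_skips("failing_tape.img", "recovered.img")
```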


Storage Server Data Disasters – Common Scenarios (Part I)

When a data loss occurs on something as valuable as a server, it is essential to the life of your business to get back up and running as soon as possible.

Here is a sampling of specific types of disasters, accompanied by actual engineering notes from recent Remote Data Recovery jobs:

Causes of partition/volume/file system corruption disasters:
•    Corrupted file system due to system crash
•    File system damaged by automatic volume repair utilities
•    File system corruption due to partition/volume resizing utilities
•    Corrupt volume management settings

Case study
Severe damage to the partition/volume information on a Windows 2000 workstation. The customer had used third-party recovery software, which did not work, then reinstalled the OS but was still looking for a second partition/volume. We found it, and it was a 100% recovery.
Evaluation time: 46 minutes. (Evaluation time represents the time it takes to evaluate the problem, make the necessary file system changes to access the data, and report on all of the directories and files that can be recovered.)

Causes of specific file error disasters:
•    Corrupted business system database; file system is fine
•    Corrupted message database; file system is fine
•    Corrupted user files

Case study
Windows 2000 server; a volume repair tool damaged the file system, making the target directories unavailable. Complete access to the original files was critical. Remote data recovery safely repaired the volume and restored the original data; 100% recovery.
Evaluation time: 20 minutes

Exchange 2000 server with a severely corrupted information store; the cause of the corruption was unknown. We scanned the information store file for valid user mailboxes; because of the corruption, results took up to 48 hours. The backup was one month old and not valid for the users.
Evaluation time: 96 hours (4 days)


Storage Server Data Disasters – Common Scenarios (Part II)

Possible causes of hardware related disasters:
•    Server hardware upgrades (storage controller firmware, BIOS, RAID firmware)
•    Expanding storage array capacity by adding larger drives to controller
•    Failed array controller
•    Failed drive on storage array
•    Multiple failed drives on storage array
•    Storage array failure but drives are working
•    Failed boot drive
•    Migration to new storage array system

Case study
Netware volume server, traditional NWFS; a failing hard drive made the volume inaccessible and Netware would not mount it. The errors on the hard drive were not in the data area and the drive was still functional. We copied all of the data to another volume; 100% recovery.
Evaluation time: 1 hour

Causes of software related disasters:
•    Business system software upgrades (service packs, patches to business system)
•    Anti-virus software deleted or truncated a suspect file in error, and the data has been deleted, overwritten, or both

Case study
A partial drive-copy overwrite using third-party tools: the overwrite started and then crashed 1% into the process. We found a large portion of the original data, rebuilt the file system, and provided reports on the recoverable data; the customer will require that we test some files to verify the quality of the recovery.
Evaluation time: 1 hour

Causes of user error disasters:
•    During a data loss disaster, restored backup data to the exact location of the lost data, thereby overwriting it
•    Deleted files
•    Operating system overwritten by a reinstall of the OS or application software

Case study
A user’s machine had its OS reinstalled from a restore CD; the user was looking for an Outlook PST file. Because the original file system was completely overwritten, we searched the whole drive for PST data (a bare-bones sketch of this signature search follows this case study). We found three files that might contain the user’s data; using PST recovery tools, we found that one of them contained all of the user’s email. There were some missing messages, but the majority of the messages and attachments came back.
Evaluation time: 5 hours
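
The signature search described in this case study can be illustrated with a bare-bones file-carving sketch: scan the raw disk image for the PST header magic bytes ("!BDN", the first four bytes of a PST/OST file) and record each hit as a candidate file start. The chunk size and file names are illustrative assumptions; this is not the tool actually used on the job.

```python
# A bare-bones file-carving sketch: scan a raw disk image for the PST
# header signature and report candidate file-start offsets.
PST_MAGIC = b"!BDN"        # first four bytes of a PST/OST file header
CHUNK = 4 * 1024 * 1024    # scan 4 MiB at a time (assumed)

def find_pst_candidates(image_path: str) -> list:
    """Return byte offsets in the image where the PST signature appears."""
    hits = []
    with open(image_path, "rb") as img:
        offset = 0          # total bytes read so far
        tail = b""          # carry-over so matches can span chunk boundaries
        while True:
            chunk = img.read(CHUNK)
            if not chunk:
                break
            buf = tail + chunk
            base = offset - len(tail)   # absolute offset of buf[0]
            pos = buf.find(PST_MAGIC)
            while pos != -1:
                hits.append(base + pos)
                pos = buf.find(PST_MAGIC, pos + 1)
            tail = buf[-(len(PST_MAGIC) - 1):]  # keep the last 3 bytes
            offset += len(chunk)
    return hits

# Usage (illustrative): each hit would then be carved out and validated
# with PST repair tools before handing anything back to the user.
# candidates = find_pst_candidates("workstation.img")
```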

Causes of operating system related disasters:
•    Server OS upgrades (service packs, patches to OS)
•    Migration to different OS

Case study
Netware traditional file system, 2TB volume; the file system was damaged while trying to expand the volume. Repaired on the drive; volume mountable.
Evaluation time: 4 hours



Hardware Life Cycle Management (Part I)

Every IT professional can tell a horror story about an upgrade, roll-out, or migration gone awry. So many factors are involved: hardware, software, compatibility, timing, data, procedures, security protocols, and of course the well-meaning but imperfect human.

Throughout 2008, IT departments and staff can look forward to a number of upgrade projects for their computer system infrastructure. According to Gartner, Inc., PC shipments during the fourth quarter of 2007 increased 13.1% over the same period in 2006, and global PC shipments during 2007 increased 13.4% over 2006, to 271.2 million units.

While a slower economy than in previous years may lower the number of units, the fact that organizations have been investing in new units shows that Hardware Life-cycle Management is still a mainstay of corporate IT’s responsibilities and will continue to be.

IT professionals realize that scheduled change is a pattern for the industry. Whether this change involves accommodating new users, replacing old servers, or upgrading staff to newer systems, there is always change within the computer organization. It is tempting to rely only on hardware or software budgets for your roadmap. However, these budgets may be short-sighted and lack proper planning. Using accounting budgets alone to manage hardware may not take into consideration the overall life span of the equipment.

Equipment/software life-cycles and your road map
Managing IT equipment and product life-cycles is an important function of IT department staff. As a goal, equipment life-cycle management should reduce failures and data-loss because computer equipment is replaced before it fails, and it should reduce the total cost of equipment management over its lifetime. Depending on the organization, equipment life-cycles are based on different criteria.

•    Warranty expiration: If your IT infrastructure has a mix of equipment in place, with different makes and types of equipment, then warranty-based product life-cycle management will be complicated. This approach is not only short-sighted, it also mirrors the circumstances of the original purchase. Consider the expanding department that needs to plead with the CFO or budgetary manager for an unplanned equipment purchase. Three years later, when the warranty expires, the department will be back on its knees begging for replacements or an extension to the expiring warranties. Either way, it will be an unplanned expense.

•    Waiting until equipment fails: In our economy, budgets are tight and management rightfully wants to get the most production or usage out of a piece of equipment before having to replace it. This approach is very risky and will usually cost more in the end. IT equipment rarely fails at a “convenient” time. If you’re lucky, the failure occurs during a slower period and your IT department is equipped to get you back up and running quickly. In reality, this is not usually the case. Consider the real cost of equipment failure if it is month-end or year-end and the server with the financial data crashes; or a company has just secured a large contract and, at the eleventh hour, one or more workstations fail or become intermittent, causing wasted downtime on the project and inefficient use of personnel resources.

•    Capital expense budgets: Some IT departments base their product life-cycles on departmental accounting policies for capital expense purchases. Of course, this method can have a knock-on effect when there is a business need for expansion that wasn’t considered in the fiscal budget. Additionally, in larger user environments, departments may control their own capital expense budgets, so there may be many departments with different budget needs. When the life-cycle of one department’s equipment is complete, the resulting fragmented purchases may actually reduce your company’s buying power. In contrast, a more structured approach would concentrate equipment purchases at various times throughout the year. This method is preferred by CFOs and budget managers, who will use a predefined purchase allocation per business unit or department to facilitate budget planning for the next year.


Hardware Life Cycle Management (Part II)

There are a number of financial planning exercises that can help you determine if capital expenses for PC hardware with complete parts and service contracts for the life of the unit are best suited for your IT infrastructure.

Alternatively, leased IT equipment may be more cost effective and would assist in maintaining a more comprehensive IT equipment life-cycle program.

As we dig further into this topic, you will see that hardware and software deployment planning is just the start of discussion for the IT group. Migration planning raises more questions than answers and these questions start with equipment and software life-cycle management. For example, planning discussions can start with these questions:

•    What is your IT department’s roadmap for equipment management?
•    What about the users you support? Does your roadmap align with their needs?
•    What requirements have inter-company business owners or department managers contributed to the overall equipment management policy? Are any of the suggested requirements based on the methods mentioned above? (i.e., does the accounting department determine the life-cycle, does the OEM warranty determine it, or is the policy just to “run the equipment into the ground”?)

Visualising the product map of the software your organisation uses and planning your major equipment purchases within a timeline helps structure your hardware retirement strategy. By synchronising your hardware purchases with your software investment, you can minimise large capital expenditures and stagger departmental purchases so that you can qualify for volume discounts.

Additionally, if your organisation qualifies for specific licensing models, you may be able to plan your software purchasing on alternate years from your hardware purchasing. Take Microsoft’s core software products as an example (Fig. 1).

Figure 1: Recent Microsoft software product launches

It is tempting to think that only hardware equipment has a life-cycle, yet the above example clearly shows that software too has a life-cycle. Could your IT infrastructure benefit from synchronising your life-cycle management of both PC hardware units and software licenses? Where does your organisation envision product adoption and integration with respect to manufacturer rollout? Finally, does your PC hardware for servers, desktops, and laptops or notebooks align with or complement that vision?


Hardware Life Cycle Management (Part III)

Planning for a migration
Planning for product life-cycles necessitates an implementation strategy. Migration of computer systems has evolved from the manual process of a complete rebuild and then copying over the data files to an intelligent method of transferring the settings of a particular system and then the data files.

Many IT professionals can attest to the large investment of time that goes into setting up and fine-tuning new servers. The complexity of domain controllers, user and group policies, security policies, operating system patches, and additional services to users all require time to set up, and fine-tuning the server after the rollout can be time consuming as well. Once that work is complete, a computer system administrator wants confidence that the equipment and operating system are going to operate normally.

Thought also needs to be given to the settings and other customization that users have applied to their workstations. Some users have enough rights over their machines to customize software installations, redirect default file locations to alternate locations, or install programs that are unknown to the IT department. All of these unique user settings can make a blanket migration unsuccessful. The aftereffect is a disaster: users with missing software and data files, lost productivity as they re-customize their workstations, and, worst of all, overwritten or lost files.

Deployment test labs are a must for migration preparation. A test lab should include, at a minimum, a domain controller, one or two sample production file servers, and enough workstations, sample data, and users to simulate a user environment. Virtualization software can assist with testing automated upgrades and migrations. The software tools to do the actual migration are varied – some are from operating system software vendors, others may be third party applications or enterprise software suites that provide other archiving functions. There are a number of documents and suggestions for migration techniques (some are listed in the references).

The success of a migration rests on analysis, planning, and testing before rolling out changes. For example, one company with over 28,000 employees has a very detailed migration plan for its users. The IT department used a lab, separate from the corporate network infrastructure, to test deployments and had a team working specifically on migration. The team had completed the test-lab phase of their plan and the migration was successful in that controlled environment.

The next phase was to roll out a test case in some of the smaller departments within the company. The test case migration was scheduled to run automatically when users logged in. The migration of the user computers to a new operating system started as planned. After the migration, the user computers automatically started downloading and installing software updates (a domain policy). Unfortunately, one of these updates had not been tested. The unexpected result was that user computers in the test case departments were inoperable.

Some of the users in the test case contacted the IT Help Desk for assistance. IT immediately started troubleshooting the operational issues without realizing that they were caused by an error in the migration test case. Other users in the department who felt technically savvy tried solving the problem themselves. This made matters worse when one user reformatted and reinstalled the operating system, overwriting a large portion of the original data files.

Fortunately for this company, their plan was built in phases and had break-points along the way so that the success of the migration could be measured. The failure in this case was two-fold in that there were some domain policies that had not been implemented on test lab servers, and the effect of a migration plus the application of software updates had not been fully tested. The losses were serious for some users, yet minimal for the entire organization.

For other migration rollouts, the losses can be much more serious. For example, one company’s IT department created a logon script to apply software updates. However, an un-tested line of the script started a reinstall of the operating system. So as users were logging into their computers at the start of the week, most noticed that the startup was taking longer than usual. When they finally were able to access their computer desktop, they noticed that all of their user files and settings were gone.

The scripting problem was not seen during the test lab phase, IT staff said. Over 300 users were affected and nearly 100 computers required data recovery services.

This illustrates the importance of the planning and testing phases of a migration. Creating a test environment that mirrors the IT infrastructure will go a long way toward anticipating and fixing problems. But despite the most thought-out migration, the most experienced data professionals know that they can expect the unexpected. Where can you turn if your migration rollout results in a disaster?


Unique data protection schemes

Storage system manufacturers are pursuing unique ways of processing large amounts of data while still being able to provide redundancy in case of disaster. Some large SAN units incorporate intricate device block-level organization, essentially creating a low-level file system from the RAID perspective. Other SAN units keep an internal block-level transaction log, with the SAN’s control processor tracking all of the block-level writes to the individual disks. Using this transaction log, the SAN unit can recover from unexpected power failures or shutdowns.
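
As a toy illustration of that block-level transaction log, the sketch below records each intended write in a log before applying it, so that uncommitted writes can be replayed after an unclean shutdown. This is a simplified model of write-ahead logging in general, not any vendor’s actual SAN firmware; the class and method names are invented for illustration.

```python
# A toy block-level transaction (write-ahead) log: log the intent, apply
# the write, then mark it committed. recover() replays anything logged
# but not committed. Real SAN controllers use battery-backed log regions
# with checksums; this just shows the ordering that makes recovery work.

class LoggedBlockDevice:
    def __init__(self, n_blocks: int, block_size: int = 512):
        self.blocks = [b"\x00" * block_size] * n_blocks  # the data area
        self.log = []  # stand-in for a persistent log region

    def write_block(self, lba: int, data: bytes) -> None:
        # 1. Durably record the intent before touching the data area.
        record = {"lba": lba, "data": data, "committed": False}
        self.log.append(record)
        # 2. Apply the write to the data area.
        self.blocks[lba] = data
        # 3. Mark the record committed. A crash between steps 1 and 3
        #    leaves a replayable record, so the write is never half-lost.
        record["committed"] = True

    def recover(self) -> int:
        """Replay uncommitted records after an unclean shutdown."""
        replayed = 0
        for record in self.log:
            if not record["committed"]:
                self.blocks[record["lba"]] = record["data"]
                record["committed"] = True
                replayed += 1
        return replayed

dev = LoggedBlockDevice(n_blocks=8)
dev.write_block(3, b"\xab" * 512)
# After a simulated power failure, dev.recover() reapplies unfinished writes.
```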

Some computer scientists specializing in the storage system field are proposing adding more intelligence to the RAID array controller card so that it is ‘file system aware.’ This technology would provide more recoverability in case disaster strikes, the goal being a storage array that is more self-healing.

Other ideas along these lines involve a heterogeneous storage pool where multiple computers can access information without being dependent on a specific system’s file system. In organizations with multiple hardware and system platforms, a transparent file system would provide access to data regardless of which system wrote it.

Other computer scientists are approaching the redundancy of the storage array quite differently. The RAID concept is in use on a vast number of systems, yet computer scientists and engineers are looking for new ways to provide better data protection in case of failure. The goals that drive this type of RAID development are data protection and redundancy without sacrificing performance. (The sketch below illustrates the basic parity idea on which this redundancy builds.)
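
For readers unfamiliar with how parity-based RAID achieves redundancy without a full mirror, here is a minimal sketch of the core math: the parity block is the XOR of the data blocks, so any single lost block can be rebuilt from the survivors. Real arrays stripe data and rotate parity (RAID 5) or keep dual parity (RAID 6); this shows only the principle, with made-up stripe contents.

```python
# The core parity math behind single-failure redundancy: parity is the XOR
# of the data blocks, so any one missing block equals the XOR of the rest.

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

stripe = [b"AAAA", b"BBBB", b"CCCC"]   # one stripe across three data drives
parity = xor_blocks(stripe)            # stored on the parity drive

# Simulate losing drive 1: rebuild its block from the others plus parity.
rebuilt = xor_blocks([stripe[0], stripe[2], parity])
assert rebuilt == stripe[1]
print("rebuilt block:", rebuilt)
```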
