by Joseph Watson » Wed Jul 16, 2014 6:46 am
Hi Amit,
I am sure you do not really expect anyone to be able to tell you what is wrong with your system with only this meager information as a guide. You will have to supply much more information before anyone can even get started on your issue. In the meantime, here are a few questions that you should be asking yourself.
Have you measured the time period until failure? Is it consistently the same length of time? At one point, you say it is 5 to 6 days. Then you say 4 to 5 days. Does it actually vary? Does it vary that much or does it just seem like it? Has it ever failed in a much shorter or longer period of time?
Are there software counters in your application that might take several days to overflow?
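To put rough numbers on that question, here is a minimal sketch in C that estimates how long an unsigned counter takes to wrap around. The function name and the tick rates mentioned afterward are my own assumptions for illustration, not anything taken from your application.

```c
#include <assert.h>

#define SECONDS_PER_DAY 86400.0

/* Days until an unsigned counter of the given width wraps around,
 * if it is incremented ticks_per_second times each second.
 * Both parameters are illustrative; substitute your own values. */
double days_until_overflow(unsigned bits, double ticks_per_second)
{
    assert(bits > 0 && bits <= 64);
    assert(ticks_per_second > 0.0);

    double counts = 1.0;
    for (unsigned i = 0; i < bits; i++) {
        counts *= 2.0;              /* 2^bits values before the wrap */
    }
    return counts / ticks_per_second / SECONDS_PER_DAY;
}
```

For example, a 32-bit counter incremented 10,000 times a second wraps after about 4.97 days, while the same counter incremented once per millisecond lasts about 49.7 days. If any counter in your code works out to something near your observed failure interval, that counter deserves a close look.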
There are many possible sources of this problem. Does the PIC chip have proper decoupling capacitors? Does it have a stable power supply? Is it possible that electrical noise spikes, such as electromagnetic impulses, are getting into your system somewhere? Do the failures ever coincide with external events such as thunderstorms or the starting or stopping of heavy industrial equipment? Have you ever been able to cause a failure deliberately by doing something?
Are there operator controls interfaced to the system? You did not mention any. Might failures be taking place when operator controls are used or when external sensors such as limit switches are operated?
You said there is an LED that you are blinking in response to Timer 0. Does the LED always stop in the On or in the Off state or does it vary?
Have you conducted any tests to help lead you to the problem? What tests did you perform? What were the results? Have you done anything to narrow down the cause of the problem?
Have you tried stopping the GPS data to see if the system still fails? Are there other aspects of your system that you could stop or run at different speeds or with different data to see what effect that has?
Can you duplicate the problem on a second system or is there only one system available for testing?
How about having various key locations in your program each write a distinct value to some externally visible register? Then when it stops, looking at the contents of the register will at least give you some idea of where the program was when the system failed. Repeat the test several times and see if it is always failing in the same place or in a variety of places.
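A minimal sketch of that idea in C follows. Everything named here is hypothetical: debug_port is an ordinary variable standing in for whatever externally visible register you have free (a spare output port, for instance), and the checkpoint names are guesses at the stages your program might have.

```c
#include <stdint.h>

/* Stand-in for an externally visible register. On real hardware this
 * would be a spare output port or similar (an assumption; which one
 * depends on your PIC and your board). */
volatile uint8_t debug_port;

/* Hypothetical checkpoint IDs, one per key location in the program. */
enum checkpoint {
    CP_MAIN_LOOP  = 1,
    CP_GPS_READ   = 2,
    CP_GPS_PARSE  = 3,
    CP_TIMER0_ISR = 4
};

/* Record the most recent location reached. */
void checkpoint(enum checkpoint id)
{
    debug_port = (uint8_t)id;
}

/* Illustrative main-loop body: mark each stage as it starts. */
void main_loop_iteration(void)
{
    checkpoint(CP_MAIN_LOOP);
    /* ... housekeeping ... */
    checkpoint(CP_GPS_READ);
    /* ... read GPS bytes ... */
    checkpoint(CP_GPS_PARSE);
    /* ... parse the sentence ... */
}
```

One thing to keep in mind with this approach: if an interrupt routine also writes a checkpoint, it will overwrite the value left by the main loop, so you may prefer separate bits or separate registers for interrupt code and main-line code.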
Clearly, one distinct possibility is that the problem is data dependent. That is, the failure may only occur when certain combinations of data show up in your system and with just the right timing to trigger the failure.
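One way to test for data dependence is to keep a short history of the most recent input so it can be examined after a failure. The following is only a sketch under my own assumptions: the names, the 64-byte size, and the idea that you can read the buffer out afterward (with a debugger or a serial dump) are not taken from your description.

```c
#include <stddef.h>
#include <stdint.h>

#define HISTORY_SIZE 64u   /* hypothetical size; must be a power of two */

static uint8_t history[HISTORY_SIZE];
static size_t  history_next;    /* index of the next slot to write */
static size_t  history_count;   /* bytes stored, capped at HISTORY_SIZE */

/* Call this for every received byte, e.g. from the receive routine. */
void history_record(uint8_t byte)
{
    history[history_next] = byte;
    history_next = (history_next + 1u) & (HISTORY_SIZE - 1u);
    if (history_count < HISTORY_SIZE) {
        history_count++;
    }
}

/* Copy the stored bytes, oldest first, into out, which must hold at
 * least HISTORY_SIZE bytes. Returns the number of bytes copied. */
size_t history_dump(uint8_t *out)
{
    size_t start = (history_next + HISTORY_SIZE - history_count)
                   & (HISTORY_SIZE - 1u);
    for (size_t i = 0; i < history_count; i++) {
        out[i] = history[(start + i) & (HISTORY_SIZE - 1u)];
    }
    return history_count;
}
```

If the last bytes before several failures turn out to look alike, you can replay that same data on the bench and see whether it triggers the failure on demand.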
Of course, there are far more ways to test a system. I am sure that others will have good questions for you and good tests to try as well.
I recommend that you go gather some information that will help you to locate the problem. Be as creative about devising tests as you are about creating the system in the first place.
Good luck.
NCR once refused to hire me because I was too short. I'm still waiting on my growth spurt.