Sunday, 29 July 2012

Updated

The blog has been updated with a new poll question.

The Consultancy and 1-day course pages have been brushed up with short videos.



Monday, 22 August 2011

Blog pause

Dear blog reader

I will be taking a break from this blog in the near future. It is, after all, fairly complete.


The first blog posts grew out of a course, "Introduction to EN 50126", that I gave to a broad group of employees in railway companies.

Immediately after the first blog posts, I received positive feedback from colleagues inside and outside Europe. It whetted my appetite to continue, so thank you for your comments and mails!

At present, I think the blog covers the most significant subjects. Everything is collected in the "Quick Guide to EN50126".
Maybe I will start again.

In the meantime, please take a 'Tour de safety management'.

Best regards

Troels Winther



Sunday, 2 January 2011

Putting it all together


How do we take the abstract key concepts of EN 50126 / IEC 62278 and convert them into a well-functioning Safety Management System?

 

Case 1: Small Supplier Company

A small development house with twenty employees produces a control circuit for industrial applications.
They realize that the control circuit is suited to controlling points in railway tracks, but the circuit has to be Safety Approved.
First, a plan for converting the control circuit into a safety approved circuit is written in a living document, named the Safety plan.
A further investigation of the company shows that they already have an ISO certificate. This means that most of the quality and configuration management is in place.
However, the audit also discloses that the company has one key software developer who keeps all source files on his own computer, and that most software decisions are taken at informal meetings.
Nobody in the company, except the programmer, can explain in detail how the software code works.
In order to fulfil EN 50126 / IEC 62278, the programmer is asked to make a System definition of the software, hardware and development environment, read EN 50128 and make a flowchart of the code.
All interfaces to the system definition have to be described, and the development engineers are asked to write a document describing the Safety principles in the design (TR 50129).
A Hazard workshop is held, describing all hazards that can arise if the control circuit does not work as expected. Mitigating actions for the hazards are listed in a Hazard log, and derived Safety requirements are identified.
The proof that the Safety requirements are fulfilled and the hazards are closed is written in a Safety Case.
The quality system is updated with change management procedures for changing the functionality of the control circuit. The process includes Minutes of meetings, Responsibilities and Mandatory actions in each Phase.
The company has already separated development and validation testing into two independent departments.
There is no need to change this organization; however, a new procedure on mandatory education ensures that all current and future employees participate in this course.
Finally, an external Assessor is hired to supervise the fulfilment of the Safety plan.
The basic concepts of EN 50126 are now implemented and the company is ready to meet the local Safety Authority.
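
The traceability in Case 1, from hazard to mitigating action to derived Safety requirement to proof in the Safety Case, can be sketched as a small data structure. This is only an illustration: the field names, the closing rule and the two example hazards are hypothetical and are not taken from EN 50126 or from any real Hazard log.

```python
from dataclasses import dataclass, field


@dataclass
class Hazard:
    """One Hazard log entry (hypothetical field names)."""
    hazard_id: str
    description: str
    mitigations: list[str] = field(default_factory=list)
    safety_requirements: list[str] = field(default_factory=list)
    safety_case_evidence: str = ""  # reference to the proof; empty while open

    @property
    def closed(self) -> bool:
        # Assumed closing rule: the hazard is mitigated, traced to at least
        # one derived Safety requirement, and evidenced in the Safety Case.
        return bool(
            self.mitigations
            and self.safety_requirements
            and self.safety_case_evidence
        )


def open_hazards(log: list[Hazard]) -> list[str]:
    """Return the ids of hazards that are not yet closed."""
    return [h.hazard_id for h in log if not h.closed]


# Hypothetical example entries, for illustration only.
hazard_log = [
    Hazard(
        "HZ-01",
        "Control circuit moves the points while a train is passing",
        mitigations=["Block the command while the track is occupied"],
        safety_requirements=["SR-01"],
        safety_case_evidence="Safety Case, section 4.1",
    ),
    Hazard(
        "HZ-02",
        "Loss of power leaves the points in an undefined position",
        mitigations=["Fail-safe default state"],
        safety_requirements=["SR-02"],
    ),
]

print(open_hazards(hazard_log))
```

An Assessor could use a list like this to see at a glance which hazards still lack evidence in the Safety Case.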

Case 2: Major Operator

See "Quick Guide to Safety Management based on EN50126"

Case 3: The Cut-off Safety Authority

See "Quick Guide to Safety Management based on EN50126"

Next chapter >> 7.1 How are the standards produced?




Friday, 12 February 2010

Quantitative Risk Analysis


In some situations the qualitative risk analysis or the ALARP principle is insufficient: the safety people are divided and disagree among themselves. It is then time to use a heavier tool, the quantitative risk analysis.
The fault tree is built in Excel and models a scenario where a passenger is trapped between closing doors. (All numbers and technical barriers are hypothetical.)



Interpretation

The quantitative risk analysis is the right way to estimate the frequency of a hazard.
It takes personal preoccupations out of a safety problem and ensures that the discussions are conducted on an objective basis.
The fault tree above concerns a commuter fleet of 80 trains operating 365 days per year with 100 departures per train per day. This results in c = 2.9E+06 departures per year for the fleet.
For an accident to occur, a passenger's arm or leg, or an item such as a baby carriage or an umbrella, has to be caught between the closing doors. This is judged to happen continuously when passengers pass through the doors, meaning d = 1.
There are three barriers that prevent the hazard:
A human-based departure procedure, where the train driver looks out of the window and checks the doors before departure (e). It is estimated that the driver misses a check every fourth day due to distraction or lack of concentration, meaning e = 1/(4*b).
There are also two technical functions:
- A traction blocking that prevents the train from driving if the door controllers indicate that the doors are open (f). This function is part of the train computer and is expected to be reliable, with a failure rate of 1 failure per 1,000,000 departures.
- A trap detection system in the door controller that prevents passengers from being squeezed in a closing door (g). This function is sensitive to the door mechanics; the FRACAS system indicates a failure rate of 1 failure per 10,000 departures.
As can be seen, we end up with an accident where the train departs with a passenger trapped between the doors about once a year. The Safety department recorded one such incident in the past year, indicating that the fault tree is trustworthy.
Can we accept this? What is our quantitative acceptance criterion? It should be stated in writing in the safety management system of the Operator.
The safety management now decides that the above result is unacceptable. We can only allow the hazard to occur once every 10,000 years.
A deeper analysis shows that failures of the detection function only occur for thin objects like a small child's arm.
It is judged that in daily operation the detection system is activated by large objects like a person; thin objects only occur 3 times per day, meaning d = 3/b.
The sensors are adjusted and a maintenance program is introduced; the subsequent test results show an improved reliability of about 1 failure per 100,000 departures (g).
An additional departure procedure is introduced, stating that the train conductor has to supervise the train doors before departure from in front of a dedicated door. A new technical feature makes it possible to close the other doors first; finally, the train conductor enters through the last door before departure. The improved procedure is expected to be more reliable, with an estimated human failure rate of 1 failure per 10,000 departures (e).
These mitigating actions dramatically lower the frequency of the hazard to approx. 10,000 years between accidents, thereby fulfilling the acceptance criterion.
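
The arithmetic behind the two results can be sketched in a few lines of Python. The spreadsheet itself is not reproduced here, so the gate structure is an assumption: the top event is taken as c * d * e * (f + g), that is, an object in the doors AND a missed departure check AND a failure of either technical function. The symbol b is assumed to be the 100 departures per train per day. With these assumptions the sketch lands close to the figures quoted in the text; the function and variable names are illustrative only.

```python
def accidents_per_year(c, d, e, f, g):
    """Top-event frequency under an assumed gate structure.

    Assumption (not taken from the original spreadsheet):
    object in door AND departure check missed AND
    (traction blocking fails OR trap detection fails).
    The OR gate uses the rare-event approximation f + g.
    """
    return c * d * e * (f + g)


DAYS_PER_YEAR = 365
TRAINS = 80
b = 100  # departures per train per day (assumed meaning of b)
c = DAYS_PER_YEAR * TRAINS * b  # fleet departures per year

# Before the mitigating actions:
# d = 1, driver misses one check every fourth day, f = 1E-6, g = 1E-4.
before = accidents_per_year(c, d=1.0, e=1 / (4 * b), f=1e-6, g=1e-4)

# After the mitigating actions:
# thin objects 3 times per day, improved procedure and adjusted sensors.
after = accidents_per_year(c, d=3 / b, e=1e-4, f=1e-6, g=1e-5)

print(f"departures per year: {c:.1e}")
print(f"before: {before:.2f} accidents per year")
print(f"after: one accident per {1 / after:,.0f} years")
```

Keeping the whole model in one short function is in the spirit of the KISS rule: every input is visible and can be challenged individually.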
  

As a side effect, the analysis proves the importance of the departure procedure and the detection function.

The old rule, KISS (Keep It Simple, Stupid), is recommended for quantitative analysis. Fault trees easily swell into large trees with many undocumented values based on engineering judgement. This only starts new discussions.

Next chapter >> 4.5 Common Cause Failures (CCF)

Focus on the Source

The "Guide to the application of EN 50126-1 for safety", TR 50126-2: Feb. 2007, concerns risk modelling and quantitative risk models.

Chapter 5.2, "Generic Risk Model" says:

Modelling predominantly represents a simplification and generalisation of reality but, enhances our understanding of causal relationships, highlights important factors and provides a useful tool for anticipation and potentially prediction of future.
A risk model may be created for a specific task (e.g., occurrence of a hazard, a combination of hazards, an operation, a sub-system, etc.) for a particular application or for a whole railway system by applying the risk assessment process to the relevant task or to the railway system.
[.....]
Developing a risk model for a whole railway system is a demanding task [....] the report does not recommend a single generic risk model for a whole railway system. [....]
Annex D lists essential steps for building such a model [....]



Configuration Management


Configuration management is the task of keeping documents and product configurations under control.
When a major project is running full steam ahead, configuration management is a challenge.
However, if safety were a house, configuration management would be the foundation.


Interpretation

The quality of the configuration management is easy for an auditor to sense.
In organizations with strong safety management (high SIL), the configuration management is meticulous and runs without a hitch: documents, minutes of meetings and changes to the product are controlled in configuration management systems with fields for unique identity, date, responsible person, revision, documents to be updated, tests performed, etc.

In organizations with a lower safety culture, the configuration management is random and uneven: not all meetings have minutes, or you may hear a busy employee say: "I do not have time for making registrations!"
Such a statement indicates low awareness of the traceability requirements for safety decisions.
It takes commitment from the executives to change a low safety culture into a high one where configuration management is concerned.
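
The record fields mentioned above (unique identity, date, responsible, revision, documents to be updated, tests performed) can be sketched as a simple data structure. This is a minimal illustration, not a description of any particular configuration management system: the class name, the field names and the completeness rule are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ChangeRecord:
    """One controlled change to the product (hypothetical field names)."""
    identity: str
    created: date
    responsible: str
    revision: int
    documents_to_update: list[str] = field(default_factory=list)
    tests_performed: list[str] = field(default_factory=list)

    def missing_fields(self) -> list[str]:
        """Names of fields left empty.

        Assumed rule for this sketch: identity, responsible and
        tests_performed must all be filled in.
        """
        missing = []
        if not self.identity.strip():
            missing.append("identity")
        if not self.responsible.strip():
            missing.append("responsible")
        if not self.tests_performed:
            missing.append("tests_performed")
        return missing


# Hypothetical record where nobody took responsibility and no test was logged.
record = ChangeRecord(
    identity="CR-0001",
    created=date(2010, 2, 12),
    responsible="",
    revision=1,
    documents_to_update=["Safety Case"],
)

print(record.missing_fields())
```

A check like this makes the gaps an auditor would look for visible before the audit takes place.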

Next chapter >> 4.2 Failure Reporting and Corrective Actions (FRACAS)

Focus on the Source

From chapter 3, "Definitions", in EN 50126

Configuration management: A discipline applying technical and administrative direction and surveillance to identify and document the functional and physical characteristics of a configuration item, control change to those characteristics, record and report change processing and implementation status and verify compliance with specified requirements.

From chapter 5.3.5, "Within all applications of this standard, the following requirements are mandatory":
...
e) an adequate and effective configuration management system shall be established and implemented...

From TR50126 Feb 2007, chapter 7.1.2
Change Management is seen as a crucial part of the LC Phases 11-13 [Operation], as emphasized in Table 7, column D: "…strict Configuration Control is THE most important issue…"

The subject is addressed to the maintenance of the QM and SM Systems of all involved parties.


