WO2014146168A1 - Automatic detection of task transition - Google Patents

Automatic detection of task transition

Info

Publication number
WO2014146168A1
Authority
WO
WIPO (PCT)
Prior art keywords
task
user
change
measure
multiple consecutive
Prior art date
Application number
PCT/AU2014/000292
Other languages
French (fr)
Inventor
Julien Epps
Siyuan CHEN
Original Assignee
National Ict Australia Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AU2013900958A external-priority patent/AU2013900958A0/en
Application filed by National Ict Australia Limited filed Critical National Ict Australia Limited
Priority to AU2014234955A priority Critical patent/AU2014234955B2/en
Publication of WO2014146168A1 publication Critical patent/WO2014146168A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013Eye tracking input arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18Eye characteristics, e.g. of the iris
    • G06V40/19Sensors therefor

Definitions

  • This disclosure concerns the monitoring of a user performing tasks.
  • the invention concerns, but is not limited to, methods, systems, software and wearable devices for monitoring a user performing tasks.
  • Modern electronic devices provide an increasing number of applications.
  • the applications are integrated into the device and can operate simultaneously. Since most applications require user interaction from time to time, users of such devices are often faced with the problem of many applications competing for the user's attention. As a result, the user is often distracted and unable to complete one task efficiently using one application before being interrupted by a different application.
  • a method for monitoring a user performing tasks comprising:
  • the determination of a task transition is based on a measure of change of an eye feature.
  • the determination of task transitions is robust and efficiently achievable with low-cost equipment, such as web-cameras.
  • processing gaze direction is often based on the spatial relationship between the eye and the scene at which the user is looking, which means calibration of the camera with respect to the position of the eyeball is used.
  • additional relationships and calibration steps introduce complexity and inaccuracies into the calculations. Determining task transitioning based on a measure of change of an eye feature avoids these extra steps and is therefore more accurate and at the same time less complex to implement.
  • the disclosed method is independent of the user's environment and therefore particularly applicable where no further devices, such as electronic displays, are used in the user's environment.
  • the method may further comprise receiving one or more transition classification parameters that are based on the first measure of change during historical task transitions, wherein determining whether the user transitioned from a first task to a second task during the period of time is based on the transition classification parameters. It is an advantage that the task transitions are determined based on historical task transitions. As a result, the method can be trained or fine-tuned to particular circumstances or particular users by re-running the training procedure.
  • the transition classification parameters may be logistic regression parameters.
  • the method may further comprise sending a detection signal if it is determined that the user transitioned from the first task to the second task.
  • Sending the detection signal may comprise sending the detection signal to a data store to store the detection signal associated with a time value.
  • Sending the detection signal may comprise sending the detection signal to a computer system to trigger a response of the computer system.
  • the method may further comprise presenting a notification to the user if it is determined that the user transitioned from the first task to the second task.
  • the notification is presented when the user transitioned to a different task.
  • the notification is presented while the user is not focussing on the first task anymore and before the user is focussing on the second task. Therefore, the user is interrupted to a smaller degree than if the notification was presented during performing a task.
  • the multiple images may be frames of a video.
  • the method may further comprise selectively determining a second measure of change and determining perceptual load based on the second measure of change for the time period where it is determined that the user did not transition from a first task to a second task within that time period. It is an advantage that the perceptual load is determined when the user did not transition from a first task to a second task. As a result, the perceptual load is task specific and more accurate than if the perceptual load was measured during task transition or over a time period that partly covers two different tasks.
  • the method may further comprise selectively determining a third measure of change and determining cognitive load based on the third measure of change for the time period where it is determined that the user did not transition from a first task to a second task within that time period. It is an advantage that the cognitive load is determined when the user did not transition from a first task to a second task. As a result, the cognitive load is task specific and more accurate than if the cognitive load was measured during task transition or over a time period that partly covers two different tasks. Cognitive load may be selectively determined where perceptual load is below a threshold. It is an advantage that the method selectively only determines cognitive load where the perceptual load is below a threshold. Since it has been shown that cognitive load estimation is inaccurate under high perceptual load, this threshold results in a cognitive load that is not negatively influenced by high perceptual load.
  • the period of time may be 1 second.
  • the method may further comprise:
  • the method may further comprise repeating the steps of receiving, selecting, determining the value, determining the first measure of change and determining whether the user transitioned from a first task to a second task.
  • the period of time of a current repetition may be based on the determination whether the user transitioned from a first task to a second task in one or more preceding repetitions.
  • the period of time of a first repetition may overlap with the period of time of a second repetition immediately preceding or following the first repetition, such that one or more of the multiple consecutive images of the first repetition are also included in the multiple consecutive images of the second repetition.
  • the period of time of the first repetition may overlap with the period of time of the second repetition by half the period of time, such that one half of the multiple consecutive images of the first repetition are also included in the multiple consecutive images of the second repetition.
  • the processing time of one repetition of the method by a processor may be shorter than the period of time.
  • the number of the multiple consecutive images may be such that the processing time of one repetition of the method by a processor is shorter than the period of time.
  • Each of the eye features may represent a change of the eye over the multiple consecutive images.
  • the method may further comprise:
  • each ellipse defining an axis, wherein determining the value representing the eye feature is based on one or more of the axes of the one or more ellipses.
  • the fitting step may comprise fitting a first ellipse to the boundary and fitting a second ellipse to only a section of the boundary that is located below a dividing line through the pupil.
  • the first ellipse may define a minor axis and the second ellipse may define a major axis, and determining the measure of change may comprise detecting a blink if the major axis of the second ellipse is greater than twice the minor axis of the first ellipse.
  • Each of the first ellipse and the second ellipse may define a centre point; and determining the measure of change may be based on a difference between the centre point of the first ellipse and the centre point of the second ellipse.
  • the second ellipse may further define a major axis and determining the measure of change may comprise detecting a blink if the difference is less than half the major axis of the second ellipse.
  • the measure of change may be based on one or more of:
  • the measure of change may be based on one or more of:
  • the measure of change may be based on one or more of:
  • the measure of change may be based on an average spectral power over one or more of the bands:
  • the measure of change may be based on one or more of:
  • the first measure of change may be based on:
  • the second measure of change may be based on:
  • the third measure of change may be based on:
  • a computer system for monitoring a user performing tasks comprising:
  • a wearable device for monitoring a user performing tasks comprising:
  • one or more cameras to capture over a period of time multiple consecutive images of an eye of the user
  • a feature extractor to determine a value representing an eye feature in each of the multiple consecutive images to obtain multiple consecutive values
  • an analyser to determine a first measure of change that characterises change of the eye feature over the multiple consecutive images based on the multiple consecutive values;
  • a detector to determine whether the user transitioned from a first task to a second task during the period of time based on the first measure of change
  • the wearable device may further comprise a light source to illuminate the eye of the user.
  • the light source may be an infra-red light source.
  • the light source may be a light emitting diode.
  • Fig. 1 illustrates a computer system for monitoring a user performing tasks.
  • Fig. 2 illustrates a method for monitoring the user performing tasks.
  • Fig. 3 illustrates an exemplary stream of images.
  • Fig. 4 illustrates the eye of the user in more detail.
  • Fig. 5 illustrates parts of a wearable device.
  • Fig. 1 illustrates a computer system 100 for monitoring a user 102 performing tasks.
  • the computer system 100 comprises computer 104 and camera 106.
  • Computer 104 includes a processor 114 connected to a program memory 116, a data memory 118, a communication port 120 and a user port 124.
  • a display device 126 displays a user interface 128 to the user 102.
  • Software stored on program memory 116 causes the processor 114 to perform the method in Fig. 2, that is, the processor 114 receives via the communication port 120 multiple images of an eye 132 of the user 102 captured over a period of time by camera 106, determines a value representing an eye feature of the eye 132 of user 102, determines a measure of change and determines whether the user transitioned between tasks.
  • the processor 114 receives data from data memory 118 as well as from the communications port 120 and the user port 124, which is connected to a display 126 that shows a user interface 128 to the user 102.
  • the processor 114 receives the multiple consecutive images from the camera 106, the data memory 118 or any other source via communications port 120, such as by using a proprietary or standard communications protocol, such as USB or firewire, a Wi-Fi network according to IEEE 802.11 or a LAN connection.
  • the Wi-Fi network may be a decentralised ad-hoc network, such that no dedicated management infrastructure, such as a router, is required.
  • Consecutive in this context means that the images may be ordered one behind another, such as by recording time. It is not limited to all images from the camera 106 but some images may be dropped or deleted. Further, it is not necessary that the images are received in the same consecutive order but the images may be received out of order and then accessed in the right order. Receiving multiple consecutive images also covers examples where processor 114 receives references to images and then accesses the images by requesting and receiving each individual image from a data store, such as a video server.
  • although communications port 120 and user port 124 are shown as distinct entities, it is to be understood that any kind of data port may be used to receive data, such as a network connection, a memory interface, a pin of the chip package of processor 114, or logical ports, such as IP sockets or parameters of functions stored on program memory 116 and executed by processor 114. These parameters may be handled by-value or by-reference in the source code.
  • the processor 114 may receive the multiple consecutive images and other data through all these interfaces, which includes memory access of volatile memory, such as cache or RAM, or non-volatile memory, such as an optical disk drive, hard disk drive, storage server or cloud storage.
  • the computer system 100 may further be implemented within a cloud computing environment.
  • Fig. 1 includes a camera 106
  • any image capturing device may be used.
  • an optical image sensor with or without optical lenses may be used as well as other image capturing device, which may be useful for machine vision only.
  • monochromatic devices as well as colour devices may be used as camera 106.
  • infrared or hyperspectral cameras with a large number of bands, such as 1000, may be used.
  • the system 100 comprises a pair of safety glasses frames worn during tasks, with two lightweight infrared web cameras mounted on it, one pointing to each eye.
  • the two web cameras may comprise a light source to illuminate the eye 132 of the user 102, such as two infrared (IR) light emitting diodes (LEDs) soldered in each camera board.
  • the web cameras are connected to a computer 104 with USB 2.0 cables and videos of only the eye region are recorded at 30 frames per second in AVI format. IR light is invisible to the eye and is therefore non-intrusive to participants, and it produces the darkest pupil effect; hence the quality of the pupil image is improved.
  • the system may measure the eye length in millimetres.
  • this disclosure relates to perceptual load and cognitive load, in relation to some of the generic attributes of the task.
  • Perceptual load occurs when a user is required to perceive more different items, or the same items with more demands on individual identification (e.g. identify an item colour and position at the same time), particularly in a short period of time [11]. It is a rather passive process, and some items cannot be perceived when perceptual load is high [11]. As high perceptual load can exhaust the available capacity for information processing, the format of user interface 128 on display 126 needs to be carefully designed to reduce perceptual load [12,13]. Although perceptual load can be estimated by counting the number of items in a controlled environment (e.g. when the user is staring straight at a known application), it is not easy to estimate it when the items perceived are uncertain (e.g. when the user is mobile or focused on some other task).
  • Another factor that influences user behaviour in interaction, which can be used to infer task characteristics, is cognitive load.
  • the concept of cognitive load has been used in a variety of fields that deal with the human mind interacting with some external stimulants.
  • the definition of cognitive load is slightly different in each field.
  • in the cognitive psychology literature, cognitive load refers to the total amount of mental activity imposed on working memory at any instant in time, while in the ergonomics literature it is described as the portion of operator information processing capacity, or resources, that is required to meet cognitive task demands.
  • cognitive load is defined as in the cognitive psychology literature.
  • Cognitive load is defined here as the mental effort or demand required to process information in working memory for a particular user to comprehend or learn some material, or complete some task [F.
  • Cognitive load is relative to both the user (i.e. their ability to process novel information) and the task being completed (i.e. complexity), at any single point in time. It is attributable to the limited capacity of a person's working memory and their ability to process novel information.
  • Cognitive load is a performance-based measure and it is noted that a measure of the complexity of the task cannot measure a person's cognitive load. This is an important distinction. Two different people can perform the same task and each be under vastly different cognitive loads. While the complexity of the task may impact the cognitive load of a person, the complexity of the task is not in itself a measure of the cognitive load being experienced by a given individual. This is why measuring cognitive load is important: attributes like task complexity (or workload, stress or attention span) are not a measure of cognitive load. A practical example shows this clearly: a first person may find a particular task easy and use little mental effort, while a second person performing the same task exerts a lot of mental effort. In both cases, the complexity of the task has not changed. Therefore, a measure of the complexity of the task cannot also serve as a measure of each person's cognitive load, since while the task is the same, the cognitive load experienced by two different individuals is very different.
  • this experiment or other experiments may be used as a training procedure to determine a transition classification parameter.
  • the transition classification parameter can then be used to determine whether the user transitioned between two tasks.
  • the setup of the training procedure allows the annotation of time windows as task transitions.
  • a measure of change is then determined during the training procedure and a classifier is trained using the annotations and the determined measure of change.
  • the trained classifier is stored on data store 118 in the form of transition classification parameters, such as by fitting a logistic regression model to the data.
  • the transition classifier is based on a logistic function, and the parameters of that function (the weight and bias terms of the logistic regression model) are the transition classification parameters.
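As an illustration, here is a minimal sketch of how such a classifier could be trained and applied, assuming scikit-learn's LogisticRegression and hypothetical feature and label arrays; the patent does not prescribe a particular library:

```python
# Hypothetical sketch: fitting the transition classifier described above.
# X holds one row of eye-activity measures of change per time window;
# y holds the annotated ground truth (1 = task transition, 0 = no transition).
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_transition_classifier(X: np.ndarray, y: np.ndarray) -> LogisticRegression:
    """Fit p(transition | x) = 1 / (1 + exp(-(b0 + b1 . x))) to the windows."""
    clf = LogisticRegression()
    clf.fit(X, y)
    # clf.intercept_ (b0) and clf.coef_ (b1) are the transition
    # classification parameters that would be stored on data store 118.
    return clf

def is_transition(clf: LogisticRegression, window_measures: np.ndarray) -> bool:
    """Apply the stored parameters to a new window's measures of change."""
    return bool(clf.predict(window_measures.reshape(1, -1))[0])
```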
  • perception classification parameters and cognition classification parameters are determined and stored in the data store in order to later determine perceptual load and cognitive load respectively.
  • Perceptual load was controlled by the number of stimuli required to be perceived in each task type.
  • the low perceptual load task was a mental arithmetic task, where four numbers sequentially replaced one of the ten 'X's (displayed in a circle on screen) in a 3-s interval, then participants used a mouse to select the correct answer from 10 displayed choices.
  • the medium perceptual load task was a spatial task, where all task stimuli were displayed simultaneously, including the direction labels of two compasses, 12 radio buttons (displayed in a circle) indicating 12 directions, and a pointer to one of the 12 directions of the inner compass (inside the circle). Participants were required to identify the extent of the two misaligned compasses and click the corresponding direction of the pointer in the outer compass (outside the circle).
  • the high perceptual load task was a search task, where 28 stimuli were not only simultaneously displayed in 28 boxes, which were evenly distributed in two rows and two columns, but also moved by one box every 3 s in one of the line directions: right, left, up and down.
  • the target number was displayed for 2 s at the beginning of the task, and participants were required to click as many targets as possible, although there were only four targets in each task.
  • Cognitive load was induced by increasing the memory set, or the degree of the interactivity of task stimuli.
  • task difficulty levels were regulated by the number of digits for addition (up to 3 digits) and carries produced by addition.
  • the difficulty level was increased when the spatial compatibility between the two compasses was reduced by removing more direction labels.
  • the difficulty from level 1 to 5 was increased by increasing the number of target digits from 1 to 5. Each digit was selected from {1, 2, ..., 9} with minimum repetition. Distractors were selected to have similar digits as much as possible but in a different sequence, to prevent participants from only memorizing a few digits. Tasks were continuously switched from one to another until a message was displayed on the screen to indicate an end, for a break. The arithmetic and spatial tasks were internally switched to the next task by a click to submit the answers, and the search task was externally switched, by ending automatically after 12 s. The executed task sequence was always one group of the three task types, followed immediately by another group of them.
  • the sequence of the group was randomized permutations of cognitive load levels, 1 to 5, in groups of three task types, with repetition, therefore 125 (5 x 5 x 5) groups of three were generated in total.
  • the randomized groups were then separated into six parts. The six parts were evenly allocated into two days. Before the three parts each day, there was a rating part where participants rated the task difficulty at the end of each task. Half of the participants began with the rating plus the first three parts on the first day and half of them began with the rating plus the other three parts. To make it more realistic in terms of luminance variation, we set the three types of task in different background colours, namely, white, grey and black for the spatial, arithmetic and search tasks respectively, since task switching is often associated with background change in real-world tasks.
  • Fig. 2 illustrates a method 200 performed by processor 114 for monitoring the user 102 performing tasks.
  • One characteristic of interest is task switching, which often occurs for a priority change and external interruptions [1]. It is an inevitable phenomenon that most people find it difficult to carry out multiple activities simultaneously because of their finite ability.
  • Tasks include a wide range of activities that the user 102 performs, such as reading an email on device 126 or driving a car. Switching between tasks may be switching between tasks that are both performed on device 126 or may be switching between tasks where only one of the tasks or no task is performed on device 126. It is noted here that the tasks need not necessarily be computing tasks. It might be e.g. the transition between talking to somebody else and reading a book. The key thing is that the computer 104 should be aware of this.
  • device 126 is not included in computer system 104, and task transitions are determined for other purposes such as sending and storing the task detection signal associated with a time value, such as a time stamp, on data store 118. While in one example the processor 114 stores the start or end time of a task as the time value, in other examples processor 114 stores the duration of the task as the time value. This way a log of the user's activity is created and the user 102 or another person, such as the user's supervisor, can revisit the log to examine the user's activity. In one example, the user 102 is an air traffic controller and makes a mistake. At a later stage, the log can be examined and it can be determined whether the user's mistake is related to switching between tasks. Of course, the task detection signal may be stored on a data store that is remote from the computer 104 in Fig. 1.
  • the method commences by the processor 114 receiving 202 multiple consecutive images of the eye 132 of the user 102 captured over a period of time, such as a time window.
  • the processor 114 selects the multiple consecutive images from a received stream of images, such as a continuous video stream, such as by selecting all images within a time period, such as a time window.
  • Fig. 3 illustrates an exemplary stream of images 302.
  • the stream of images is a video with 30 full frames per second and the time window is 1 s.
  • Processor 114 therefore selects 30 frames and covers a time window 304 of 1 s.
  • the next time window i + 1 306 overlaps with the time window i 304 by half the window time, such as 50%.
  • Processor 114 employs an average filter 308 over 3 frames to eliminate noise in the multiple consecutive images 302.
  • in one example, half the window time is exactly 50%.
  • in other examples, slight variations are also considered half the window time, in particular if the number of images is odd and no exact half can be determined. As a result, one half can have fewer images than the other half.
  • the window immediately before the current window and the window immediately after the current window cover the entire current window. As a result, each image of the video stream is included in exactly two time windows.
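A minimal sketch of this windowing scheme follows, assuming a 30 fps stream, 1-s windows with 50% overlap and the 3-frame average filter of Fig. 3; the per-frame value series (e.g. pupil diameters) and function names are illustrative:

```python
import numpy as np

def average_filter(series: np.ndarray, length: int = 3) -> np.ndarray:
    """3-frame moving average to suppress frame-to-frame noise (filter 308)."""
    kernel = np.ones(length) / length
    return np.convolve(series, kernel, mode="same")

def overlapping_windows(series: np.ndarray, fps: int = 30, seconds: float = 1.0):
    """Yield 1-s windows overlapping by half the window time, so that each
    frame falls into exactly two windows (except at the stream edges)."""
    size = int(fps * seconds)   # 30 frames per window
    step = size // 2            # 50% overlap: 15-frame hop
    for start in range(0, len(series) - size + 1, step):
        yield series[start:start + size]
```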
  • the described window configuration is applicable for determining task transition, estimating perceptual load and cognitive load as well as for training the classifier based on training data.
  • Each frame comprises the eye 132 of the user 102, either partially or fully.
  • using eye activity, that is, change of the eye feature over the multiple consecutive images, for task monitoring has advantages over other psychophysiological and behavioural signals.
  • eye activity contains three classes of information: the pupillary response (such as dilation), blink and eye movement (saccade and fixation).
  • Processor 114 performs video processing in order to determine 204 a value representing an eye feature in each of the multiple consecutive images to obtain multiple consecutive values.
  • the eye feature is the pupil of the eye 132 and the value is a characteristic of the pupil of the eye, such as pupil visibility, dilation or diameter, and the further processing steps are based on that characteristic, as described in more detail below.
  • other values representing eye features such as eyelid position, colour of the sclera or cornea may also be used.
  • the next step of method 200 in Fig. 2 is that processor 114 determines 206 a measure of change that characterises change of the eye feature, such as change of the pupil, over the multiple consecutive images based on the multiple consecutive values representing the pupil, such as multiple measurements of the pupil diameter.
  • the processor 114 determines three classes of measures of change that measure eye activity, that is, the measures of change characterise change of the pupil over the multiple consecutive images.
  • the pupil is located in the multiple consecutive images and ellipse fitting is employed to estimate the pupil shape.
  • the major axis length in pixels is measured as pupil size. Meanwhile, the minor axis length is recorded for the calculation of eccentricity with the major axis length, and the orientation of the ellipse is also recorded as a feature.
  • the ellipse centre position is logged as eye position for later fixation and saccade separation. Blinks are recorded when the pupil is occluded by at least half.
  • the fitted ellipse is superimposed onto each video for visual inspection to ensure the true eye activity is correctly represented.
  • the axis lengths, orientation and eye position are linearly interpolated during blinks, and then passed through a median filter with length three frames to remove the noise caused by rapid eye movements. Pupil size is then converted to millimetres by the ratio of the true eye length and the eye length in the video.
  • Fig. 4 schematically illustrates the eye 132 of the user 102 in more detail.
  • the eye 132 comprises an eye ball, which is mostly covered by an eye lid 404 and the skull (not shown) of the user 102.
  • the eye 132 further comprises an iris 406 and a pupil 408 centrally located in the iris 406.
  • the iris 406 and the pupil 408 are half covered by the eye lid 404 since the eye 132 was captured during a blink.
  • processor 114 determines a boundary of the pupil 408; the result is the bottom half of the pupil bounded by the lid 404 at the top, which is shown in Fig. 4 by the solid line of the bottom section of pupil 408 and the solid line of lid 404.
  • Processor 114 fits a first ellipse 410 to the entire boundary of the pupil 408 and as a result, the first ellipse fits entirely within the boundary. Further, the processor 114 fits a second ellipse 412 only to a section of the boundary that is below the line of the eye lid 404, that is, the part of the boundary that belongs to the pupil 408 itself and not the eye lid 404. As a result, the second ellipse 412 is typically larger than the detected boundary.
  • the fitting of the ellipses 410 and 412 comprises for each ellipse determining a centre point and major and minor axis lengths by minimising a least squares error function of distances of points on the boundary from the ellipses 410 and 412.
  • Processor 114 performs the two-ellipse fitting scheme for blink detection, with the rationale that the bottom half of the pupil boundary is less vulnerable to occlusions and that the centre positions of the two ellipses differ during blinking. Therefore, half of the major axis length of the first ellipse is set as the blink threshold for the difference of the two centre positions.
  • processor 114 detects a blink if a2 > 2·b1, which saves the computation of calculating the distance of the centres while similar performance is achieved.
  • b1 is the minor axis of the first ellipse 410 using the whole boundary and a2 is the major axis of the second ellipse 412 using the lower half of the pupil boundary.
  • each frame is first converted into a binary image, and geometric and temporal criteria are used to find the pupil blob. Then the whole boundary and the bottom half of the boundary are located to fit ellipses. It is noted that this process is also performed in order to calculate pupil dilation as described below.
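A hedged sketch of this pipeline using OpenCV follows; the binarisation threshold, the horizontal dividing line and the helper structure are assumptions for illustration, not the patent's exact implementation:

```python
import cv2
import numpy as np

def detect_blink(gray_frame: np.ndarray, threshold: int = 40) -> bool:
    """Binarise the frame, take the pupil blob, fit the two ellipses and
    apply the a2 > 2*b1 blink test described above."""
    # Under IR illumination the pupil is the darkest blob (dark pupil effect).
    _, binary = cv2.threshold(gray_frame, threshold, 255, cv2.THRESH_BINARY_INV)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    if not contours:
        return True  # no pupil visible: treat as fully occluded
    boundary = max(contours, key=cv2.contourArea).reshape(-1, 2)
    if len(boundary) < 5:
        return True  # cv2.fitEllipse needs at least five points

    # First ellipse (410): whole boundary; b1 is its minor axis length.
    _, axes1, _ = cv2.fitEllipse(boundary.astype(np.float32))
    b1 = min(axes1)

    # Second ellipse (412): only the boundary section below a dividing line
    # through the pupil (image y grows downwards); a2 is its major axis.
    mid_y = boundary[:, 1].mean()
    lower = boundary[boundary[:, 1] > mid_y]
    if len(lower) < 5:
        return True
    _, axes2, _ = cv2.fitEllipse(lower.astype(np.float32))
    a2 = max(axes2)

    return a2 > 2 * b1
```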
  • Fixation and saccade are separated by dispersion-based algorithms [25] using one degree of visual angle for at least 200 ms.
  • the fixation and saccade used in this example are the relative fixation and saccade to the head position.
  • a time span from 0.5 s before the end of the previous task to 0.5 s after the beginning of the current task is defined as a task transition.
  • 0.5 s is the maximum latency of an eye activity response, from a stimulus to brain and brain to eye activity, as indicated by neuroimaging studies [26]. Therefore, a 1-s time window with 50% overlap may be used in task transition detection. From the recorded task timeline, all time windows that crossed the instant of task beginning or end are labelled as task transitions in the ground truth.
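A small sketch of this labelling step under the stated assumptions (1-s windows, 0.5-s latency margin around each recorded task boundary); the function and argument names are illustrative:

```python
def label_windows(window_starts, window_len, task_boundaries, latency=0.5):
    """Label each window 1 if it crosses a recorded task begin/end instant,
    widened by the 0.5-s eye-response latency, else 0.

    window_starts: start times of the windows in seconds.
    task_boundaries: recorded task begin/end instants in seconds.
    """
    labels = []
    for start in window_starts:
        end = start + window_len
        crosses = any(start < b + latency and end > b - latency
                      for b in task_boundaries)
        labels.append(1 if crosses else 0)
    return labels
```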
  • processor 114 calculates a total of 29 measures of change, including statistics, from the time domain signals, the frequency domain and entropy.
  • the measures of change are calculated from a time window of 1 s (1-s time window), and from a time window of 2 s (2-s time window), with 50% overlap.
  • the measures of change characterise the change of the eye over multiple images, such as frames or averaged frames, and therefore each measure of change is based on multiple consecutive values representing the eye feature, such as multiple measurements of the pupil diameter.
  • processor 114 calculates the average pupil size (PDmean), pupil size range (PDrang), pupil size standard deviation (PDstd), average pupil orientation (POmean), pupil orientation range (POrang), pupil orientation standard deviation (POstd), average pupil eccentricity (PDeccMean), pupil eccentricity range (PDeccRange), and pupil eccentricity standard deviation (PDeccStd), over the time window.
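A compact sketch of these nine pupillary measures, computed with NumPy from the per-frame values of one window; the array names are illustrative:

```python
import numpy as np

def pupil_measures(size, orientation, eccentricity):
    """size, orientation, eccentricity: per-frame NumPy arrays for one window."""
    return {
        "PDmean": size.mean(),            "PDrang": np.ptp(size),
        "PDstd": size.std(),
        "POmean": orientation.mean(),     "POrang": np.ptp(orientation),
        "POstd": orientation.std(),
        "PDeccMean": eccentricity.mean(), "PDeccRange": np.ptp(eccentricity),
        "PDeccStd": eccentricity.std(),
    }
```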
  • for perceptual load and cognitive load estimation, the above measures are calculated after pupil baseline subtraction to reduce the light reflex effect.
  • the baselines are the average pupil size during the 0.5 s at the end of each task transition. For this it is assumed that each task transition had been perfectly detected. This means that perceptual load and cognitive load are only determined for time windows where no task transition is detected. Alternatively, perceptual load and cognitive load are determined but not provided to a receiver of the load estimate when a task transition is detected.
  • the estimates are provided to the receiver but the receiver is notified that the current estimate is inaccurate since the estimates were determined for a time window including a task transition.
  • the processor 114 determines whether the user transitioned between tasks and sends a detection signal such that the receiver can reject received estimates of the perceptual and cognitive load. Power spectral density from the pupil time series may also be considered. Due to the short length of the time window, processor 114 performs a parametric method for power spectrum estimation, which avoids spectral leakage and provides a better frequency resolution than FFT-based nonparametric methods.
  • the Burg method for the autoregressive (AR) model is used and the AR model order is set to the length of time window divided by three [27].
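A sketch of this spectral estimate, assuming statsmodels' Burg routine and SciPy's freqz; the PSD scaling is nominal, since only relative band powers are compared:

```python
import numpy as np
from scipy.signal import freqz
from statsmodels.regression.linear_model import burg

def pupil_psd(series: np.ndarray, fs: float = 30.0):
    """Burg AR power spectrum of a pupil-size window; the AR order is the
    window length divided by three, as stated above."""
    order = max(1, len(series) // 3)
    rho, sigma2 = burg(series, order=order)       # AR coefficients, noise power
    w, h = freqz(b=[1.0], a=np.r_[1.0, -rho], worN=256, fs=fs)
    return w, sigma2 * np.abs(h) ** 2             # frequencies (Hz), PSD

def band_power(freqs, psd, lo=0.5, hi=1.0):
    """Average spectral power over a band, e.g. the 0.5-1.0 Hz band that
    Table 1 reports as the most sensitive measure for task transition."""
    mask = (freqs >= lo) & (freqs <= hi)
    return float(psd[mask].mean())
```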
  • Blinks can be categorized into two types: functional blink, which only occurs two to four times per minute to moisturize the eyes, and non-functional blink.
  • Non-functional blink includes reflex blink (a protective response, e.g. to a puff of air), voluntary blink (a purposeful response depends on one's will) and endogenous blink (occurs unconsciously).
  • of most interest here is the endogenous blink, which is centrally controlled and has strong links to cognition [9]. Therefore, for the purpose of cognitive load estimation, a task goal and knowledge of the task duration are helpful to exclude most non-cognitive blinks by selecting appropriate task time windows.
  • Blinks may be estimated by detecting points of zero pupil diameter (from eye tracker output), due to partial eye closure, from a sequence of frames.
  • the number of frames that are treated as blink with partial eye closure may be empirically decided because the exact non-occluded pupil size is unknown.
  • a more advanced technique is to calculate the eyelid distance using the active appearance model, where both eye shape and appearance variation are learned from training data, and an initialization procedure is usually required before detection [5].
  • Example techniques include template matching and detecting luminance change in the eye regions to infer blink occurrence from far-field web cameras [11].
  • Blink number (blkNum) is calculated during the time window.
  • Inter-blink interval (blkInterval) is the number of non-blink frames during the time window.
  • Blink duration per blink number (blkDurPerNum) is the number of blink frames divided by the total number of blinks.
  • Blink product (blkProduct) is the product of blink number per second and blink duration per second.
  • processor 114 uses four measures within each analysis time window.
  • Blink rate (blinks per second) is calculated as the number of blinks that occurred divided by the time window duration.
  • Blink duration per blink is the sum of blink durations during the time window normalized by the total number of blinks that occurred in that time.
  • Inter-blink interval per second is the sum of the number of non-blink frames during the time window normalized by the duration.
  • Processor 114 also calculates the blink product per second, which is the product of blink rate and blink duration per second.
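A sketch of these four measures computed from a boolean per-frame blink mask; the mask representation is an assumption:

```python
import numpy as np

def blink_measures(blink_mask: np.ndarray, fps: float = 30.0):
    """blink_mask: True for frames classified as part of a blink."""
    duration_s = len(blink_mask) / fps
    # Count blink onsets: frames where the mask rises from False to True.
    rises = np.flatnonzero(np.diff(blink_mask.astype(int)) == 1)
    num_blinks = len(rises) + (1 if blink_mask[0] else 0)
    blink_time_s = blink_mask.sum() / fps

    blink_rate = num_blinks / duration_s
    blink_dur_per_blink = blink_time_s / max(num_blinks, 1)
    inter_blink_per_s = ((~blink_mask).sum() / fps) / duration_s
    blink_product = blink_rate * (blink_time_s / duration_s)
    return blink_rate, blink_dur_per_blink, inter_blink_per_s, blink_product
```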
  • the time window may be set as the task duration after removing the time span of 0.5 s after task beginning and 0.5 s before task end, so that during the time window, a user is actually engaged in tasks.
  • the time span from 0.5 s before the end of the previous task to 0.5 s after the beginning of the current task is defined as the task transition state, which involves task environment change at the beginning of the task and requires button clicking at the end.
  • Fixation and saccade measures may also be considered. From the time series, the processor 114 calculates the number of fixations (fixNum) during the time window, the fixation duration per fixation (fixDur), the standard deviation of the fixation duration (fixDurStd), and the first difference of fixation duration (fixDurDiff).
  • processor 114 uses an entropy-based feature on fixation (fixEntropy), where an empirical probability mass function is generated by normalising a sequence of fixation durations (in frames) by the total number of frames in the time window, and calculates the Shannon entropy from this.
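A short sketch of this measure as described: the normalised durations form an empirical probability mass function (which need not sum to one if some frames are not fixations), and its Shannon entropy is taken:

```python
import numpy as np

def fixation_entropy(fixation_durations_frames, window_frames: int) -> float:
    p = np.asarray(fixation_durations_frames, dtype=float) / window_frames
    p = p[p > 0]                              # ignore empty entries
    return float(-(p * np.log2(p)).sum())     # Shannon entropy in bits
```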
  • processor 114 computes the average amplitude of saccade (sacAmp), the standard deviation of saccade amplitude (sacAmpStd) and the first difference of saccade amplitude (sacAmpDiff).
  • processor 114 computes all measures from a 1-s time window, overlapped by 0.5 s, for task transition detection.
  • Sensitive features are initially selected using a pilot test, in which the data from the first two parts of each day is taken as the pilot training set (half of them are randomly selected as a validation set) and the data from the third part of each day as the pilot test set.
  • Kappa scores are obtained from the classification decisions and the annotated ground truth. Data from the fourth part of each day is reserved for later use (testing data).
  • the top 12 features ranked by the Kappa scores may be collected as a full set of sensitive features. Then, the best measures from the full set of sensitive measures and classifier parameters can be selected from training data to form the input of a classification system.
  • processor 114 applies a median filter of three frames to the classifier output to remove rapid fluctuations due to noise in task transition detection.
  • processor 114 determines the perceptual load (low/medium/high) every 1 s independently of cognitive load, using the measures of change derived from a 2-s time window, not including task transitions.
  • time windows more than 2 s long may be considered in practice. The subsequent procedures are similar to those for task transition detection, but without median filtering.
  • the processor 114 segments the signals into task windows according to determined task transition, and aggregates the classifier outputs using majority vote during the task window [28].
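A minimal sketch of this aggregation, assuming per-second classifier labels and the indices of detected transitions; the names are illustrative:

```python
from collections import Counter

def majority_vote(labels):
    """Return the label that won the majority vote within one task window."""
    return Counter(labels).most_common(1)[0][0]

def aggregate_by_task(per_second_labels, transition_indices):
    """Segment the label stream at detected transitions and fuse each
    task window's labels by majority vote."""
    tasks, start = [], 0
    for idx in [*transition_indices, len(per_second_labels)]:
        if idx > start:
            tasks.append(majority_vote(per_second_labels[start:idx]))
        start = idx
    return tasks
```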
  • Task-by-task scheme: from the estimated perceptual load levels above, processor 114 uses the label that won the majority vote from a task window to represent the task.
  • Cognitive load estimation
  • cognitive control typically occurs in situations of low perceptual load [11], which implies that cognitive load levels during high perceptual load tasks may not be distinguishable. Therefore, high perceptual load tasks may be excluded during model training.
  • the classification decision for cognitive load is available only when the perceptual load is not high and therefore, cognitive load is selectively determined where perceptual load is below a threshold.
  • perceptual load is determined in three levels low, medium and high.
  • the cognitive load is determined selectively where perceptual load is below medium, that is, in the low level.
  • the cognitive load is determined selectively where perceptual load is below high, that is, in the low or medium level.
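A sketch of this selective gating; the level constants and classifier callables are illustrative assumptions:

```python
LOW, MEDIUM, HIGH = 0, 1, 2

def estimate_loads(window_measures, is_transition, perceptual_clf, cognitive_clf,
                   threshold=HIGH):
    """Skip load estimation during transitions; estimate cognitive load only
    when perceptual load is below the chosen threshold."""
    if is_transition:
        return None, None
    perceptual = perceptual_clf(window_measures)
    cognitive = cognitive_clf(window_measures) if perceptual < threshold else None
    return perceptual, cognitive
```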
  • processor 114 may prune the sensitive measures set to a number of subsets for further selection of measures.
  • the principle of the pruning is to retain two or three subsets where measures are from each class of eye activity, and subsets where measures are from different classes of eye activity. This is to minimize the negative influence on classification performance due to individual differences in sensitive measures. For example, some participants have a very low blink rate, which may result in blink features losing effect.
  • measures are selected that carry the most information. This is based on measurements from one or more particular users.
  • the processor 114 may automatically choose the optimal measures for a particular user based on a training set, where time windows are annotated with a ground truth based on the experimental setup described earlier. This way, some of the time windows are labelled as task transitions, while others are labelled with different levels of perceptual or cognitive load.
  • processor 114 performs logistic regression for classification, for simplicity and to avoid overfitting, and employs a participant-dependent scheme.
  • processor 114 uses the Parzen Window or a Gaussian Mixture Model.
  • k-fold cross validation may not be used. It is noted here that for task transition detection, a yes/no decision is made by the classifier, while for perceptual and cognitive load estimation the classifier may distinguish between multiple levels of each load.
  • in some examples the same classifier, such as logistic regression, is used, while in other examples different classifiers are used to detect task transition and to estimate perceptual and cognitive load.
  • a participant-dependent scheme is used and different models for the three classification tasks are trained, using the task duration as the time window.
  • a model for each load level or state is trained using blink features, and then a test task is classified according to the maximum posterior probability or least loss.
  • a 10-fold cross validation scheme may be used.
  • the data partitions are as follows: data from the first three parts of each day forms the training data. Among the training data, 50% is randomly selected as validation data to tune not only the regularization parameter but also the best subset of the measures. Data from the fourth part of each day may be used as the testing data.
  • R: correlation coefficient.
  • the absolute value of R in the ranges of 0-0.3, 0.3-0.7 and 0.7-1.0 can be considered to indicate weak, moderate and strong linear relationships respectively.
  • Cohen's Kappa may also be used, which measures the magnitude of classification agreement and takes into account chance agreement. Generally speaking, a Kappa score of 0.01-0.20 can be interpreted as slight agreement, 0.21-0.40 as fair agreement (above 0.20 is often interpreted as statistically significant), 0.41-0.60 as moderate agreement and 0.61-0.80 as substantial agreement.
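For instance, scikit-learn provides a chance-corrected Kappa directly; the label arrays below are made-up illustrations:

```python
from sklearn.metrics import cohen_kappa_score

ground_truth = [0, 0, 1, 0, 1, 1, 0]   # annotated transition labels per window
decisions    = [0, 0, 1, 0, 0, 1, 0]   # classifier output per window
kappa = cohen_kappa_score(ground_truth, decisions)  # ~0.70: substantial agreement
```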
  • Table 1 shows the top 12 most sensitive measures of change for task transition. We can see that the average power in the frequency band of 0.5-1.0 Hz is the most sensitive measure. Meanwhile, all the sensitive measures are from two classes of eye activity, that is, pupillary response (*psd*) and blink (blk*). Measures of change from eye movements have little contribution to task transition.
  • for perceptual load, blink measures were the most sensitive measures, followed by fixation and saccade (sac*, fix*).
  • the average blink number in low and medium perceptual load tasks is higher than that in high perceptual load task.
  • Pupillary response has less influence on perceptual load, since pupil size varies in a similar range in each perceptual load task.
  • the average pupil size is the most sensitive measure for cognitive load.
  • the top 12 sensitive measures are mainly from the classes of pupillary response (P*) and eye movement (sac*, fix*).
  • Measures for perceptual load are mainly from blink, fixation and saccade.
  • Blink, fixation and saccade are correlated with the number of items to be perceived [6,21]. This confirms the efficacy of using fixation duration to indicate perceptual load [12].
  • the major features are from pupil and eye movement, similarly to [22,24].
  • At least two classes of eye activity are capable of distinguishing each of the task characteristics, which provides sufficient measures for selection to overcome individual differences in feature preference. For example, the blink rate may be too low to be used for particular users. However, by selecting measures from another class of eye activity, fair classification performance may still be achieved.
  • the method 200 of Fig. 2 may be repeated, that is, iterated, such that a video stream from the camera 106 is continuously processed in consecutive time windows and the user is continuously monitored.
  • a task monitoring system can analyse tasks using a 1-s window, which may be suitable for many real-time applications; this means the processing time needed by processor 114 to process the images of the 1-s window is less than 1 s. As a result, the processor 114 finishes processing the window of a current repetition before the next window becomes available and no backlog occurs. If the processor has a different processing power, the number of images, which is directly related to the time period or size of the window and the frame rate, may be changed such that real-time processing is possible.
  • the time period for estimating perceptual or cognitive load may extend from one detected task transition to the next detected task transition and may therefore span less than 1 s or more than 1 s, such that a single load estimate is determined for the entire time while the user 102 performs that task.
  • processor 114 determines only the task transitions while in other examples, processor 114 determines perceptual load based on the task transitions. In further examples, processor 114 determines cognitive load based on determined perceptual load and task transitions. As a result, the computational effort and therefore the number of frames or the size of each time window may be different for these different examples.
  • Processor 114 may process the data in a way that is suitable for real time implementation. For example, the classifier may be trained by the data from the first three parts of each day, where the optimum features and model parameters were tuned.
  • if the processor 114 determines that a user transitioned from a first task to a second task, then in the last step 208 in Fig. 2 the processor sends a detection signal.
  • the processor 114 sends the detection signal to a computer system, such as the computer system that controls the display of the user interface 128 on the display device 126.
  • this computer system may be the same computer system as system 104 in Fig. 1.
  • the computer system is informed that the user transitioned between tasks and that it is now the best moment to present a notification to the user.
  • Such a notification can be an acoustical signal, such as a ring tone, or a visual alert, such as a blinking or appearing symbol on display device 126, for example a toast pop-up at the bottom of the screen that informs the user about a new email or instant message.
  • the computer system that receives the detection signal is a control system, such as an instrument panel in an airplane cockpit, and a notification, such as a status message about the airplane's systems is displayed to the pilot only when it is determined that the pilot transitioned from one task, such as operating the radio system, to another task, such as navigating the airplane.
  • processor 114 may further send notifications related to perceptual or cognitive load. For example, processor 114 may generate a notification when the load is greater than a given threshold or may log the load periodically on data store 118 associated with a time value.
  • processor 114 may send the detection signal to a computer system to trigger a response of the computer system, such as displaying a notification to the user 102.
  • Fig. 5 illustrates parts of a wearable device 500 integrated into a frame of glasses (not shown). A user wears the device 500 and the device 500 monitors the user performing tasks.
  • the device 500 comprises the camera 106 from Fig. 1.
  • the device 500 further comprises a processing module 502 including a windowing module 504 connected to a feature extractor 506, which is in turn connected to an analyser 508.
  • the analyser 508 is connected to a task transition detector 510, a perceptual load estimator 512 and a cognitive load estimator 514.
  • the task detector 510, perceptual load estimator 512 and cognitive load estimator 514 are connected to output port 516.
  • When in use, the camera 106 captures a continuous stream of images of the eye of the user and provides the stream to the windowing module 504.
  • the windowing module selects multiple consecutive images from the continuous stream so that the multiple consecutive images cover a set period of time.
  • the windowing module 504 is integrated into the camera, such that the camera captures over the period of time multiple consecutive images of an eye of the user and there is no need to select the images from a video stream.
  • the windowing module 504 or camera 106 provides the multiple consecutive images to the feature extractor 506, which analyses the multiple consecutive images and determines a value representing the pupil by detecting and measuring the pupil in the multiple consecutive images.
  • the feature extractor 506 passes the determined values, such as centre point coordinates and axes lengths to the analyser 508.
  • the feature extractor 506 also detects blinks and sends the blinks in the form of time values or frame numbers to the analyser 508.
  • the feature extractor 506 provides a pulse on a signal line when a blink is detected.
  • the analyser 508 determines a measure of change, such as blink rate, based on the data received from the feature extractor 506.
  • the feature extractor 506 and the analyser 508 are combined to determine a measure of change directly from the multiple consecutive images as described earlier.
  • the measure of change is then sent to the task transition detector 510 that determines whether the user transitioned from a first task to a second task during the period of time based on the measure of change from the analyser 508.
  • the task transition detector 510 uses output port 516 to send a detection signal if the task transition detector 510 determined that the user transitioned from the first task to the second task.
  • the perceptual load estimator 512 estimates the perceptual load based on the measure of change determined by the analyser 508 and based on the task transition from the task transition detector 510.
  • the cognitive load estimator 514 estimates the cognitive load based on the measure of changes from the analyser 508, the perceptual load from the perceptual load estimator 512 and the task transition from the task transition detector 510.
  • Processor 114 may log users' tasks throughout the day and analyse them in any way to discover more about how they are using their day, how long they take on particular tasks, how much of a particular type of load they are experiencing, whether they are overloaded etc. in a similar way to other analytics applications, such as bike computers or wearable biomedical logging devices.
  • Processor 114 may automatically alert users if they are doing too much or too little of something in particular, with a view to modifying their behaviour.
  • Interruption management: Processor 114 may notify users (using any application) of an incoming message/email/call/requirement/etc. only during task breakpoints. This may reduce task error by 50%.
  • Tele-assistance: wearable computing allows a user to share their view of their task (through the head-mounted outwards-facing camera) with a remote expert, who then guides them through their task or collaborates with them on it.
  • for a remote expert, it will be an advantage to receive an indication of when to interject, to minimise distraction and maximise guidance utility for the user engaged in the task; this is difficult using existing solutions when the user is remote from the expert.
  • Emotion measurement: this term refers to determining the emotional state of the user. As explained above, measuring load continuously, that is, during task transitions, has little significance. The same applies to measuring emotion. Therefore, some of the methods disclosed here, particularly accurate pupil diameter measurement, may be applicable for measuring emotion.
  • the measure of changes, task transition, and perceptual and cognitive load may be used as biomarkers for diagnosis of mental diseases or measures of cognition, for computer games or task informatics from eye activity (post-hoc analysis).
  • Suitable computer readable media may include volatile (e.g. RAM) and/or non-volatile (e.g. ROM, disk) memory, carrier waves and transmission media.
  • Exemplary carrier waves may take the form of electrical, electromagnetic or optical signals conveying digital data streams along a local network or a publicly accessible network such as the internet.

Abstract

This disclosure concerns the monitoring of a user performing tasks. A processor receives multiple consecutive images of an eye of the user captured over a period of time. In each of the multiple consecutive images the processor determines a value representing an eye feature to obtain multiple consecutive values. Based on these multiple consecutive values the processor determines a measure of change that characterises change of the eye feature over the multiple consecutive images. Finally, the processor determines whether the user transitioned from a first task to a second task during the period of time based on the measure of change. The determination of task transitions is robust and efficiently achievable with low-cost equipment, such as web-cameras. The disclosed method is independent of the user's environment and therefore particularly applicable where no further devices, such as electronic displays, are used in the user's environment.

Description

Title
Automatic detection of task transition
Technical Field
This disclosure concerns the monitoring of a user performing tasks. In particular the invention concerns, but is not limited to, methods, systems, software and wearable devices for monitoring a user performing tasks.
Background Art
Modern electronic devices provide an increasing number of applications. The applications are integrated into the device and can operate simultaneously. Since most applications require user interaction from time to time, users of such devices are often faced with the problem of many applications competing for the user's attention. As a result, the user is often distracted and unable to complete one task efficiently using one application before being interrupted by a different application.
Any discussion of documents, acts, materials, devices, articles or the like which has been included in the present specification is not to be taken as an admission that any or all of these matters form part of the prior art base or were common general knowledge in the field relevant to the present disclosure as it existed before the priority date of each claim of this application.
Throughout this specification the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.
Disclosure of Invention
In a first aspect there is provided a method for monitoring a user performing tasks, the method comprising:
(a) receiving multiple consecutive images of an eye of the user captured over a period of time;
(b) determining a value representing an eye feature in each of the multiple consecutive images to obtain multiple consecutive values;
(c) determining a first measure of change that characterises change of the eye feature over the multiple consecutive images based on the multiple consecutive values; and
(d) determining whether the user transitioned from a first task to a second task during the period of time based on the first measure of change.
The determination of a task transition is based on a measure of change of an eye feature. As a result, the determination of task transitions is robust and efficiently achievable with low-cost equipment, such as web-cameras. This is an advantage over other methods, such as methods using gaze direction. These methods often rely on further information about the environment. For example, processing gaze direction is often based on the spatial relationship between the eye and the scene at which the user is looking, which means calibration of the camera with respect to the position of the eyeball is used. These additional relationships and calibration steps introduce complexity and inaccuracies into the calculations. Determining task transitioning based on a measure of change of an eye feature avoids these extra steps and is therefore more accurate and at the same time less complex to implement. Further, the disclosed method is independent of the user's environment and therefore particularly applicable where no further devices, such as electronic displays, are used in the user's environment.
The method may further comprise receiving one or more transition classification parameters that are based on the first measure of change during historical task transitions, wherein determining whether the user transitioned from a first task to a second task during the period of time is based on the transition classification parameters. It is an advantage that the task transitions are determined based on historical task transitions. As a result, the method can be trained or fine-tuned to particular circumstances or particular users by re-running the training procedure.
The transition classification parameters may be logistic regression parameters.
The method may further comprise sending a detection signal if it is determined that the user transitioned from the first task to the second task.
Sending the detection signal may comprise sending the detection signal to a data store to store the detection signal associated with a time value. Sending the detection signal may comprise sending the detection signal to a computer system to trigger a response of the computer system.
The method may further comprise presenting a notification to the user if it is determined that the user transitioned from the first task to the second task.
It is an advantage that the notification is presented when the user transitioned to a different task. As a result, the notification is presented while the user is not focussing on the first task anymore and before the user is focussing on the second task. Therefore, the user is interrupted to a smaller degree than if the notification was presented during performing a task.
The multiple images may be frames of a video.
The method may further comprise selectively determining a second measure of change and determining perceptual load based on the second measure of change for the time period where it is determined that the user did not transition from a first task to a second task within that time period. It is an advantage that the perceptual load is determined when the user did not transition from a first task to a second task. As a result, the perceptual load is task specific and more accurate than if the perceptual load was measured during task transition or over a time period that partly covers two different tasks.
The method may further comprise selectively determining a third measure of change and determining cognitive load based on the third measure of change for the time period where it is determined that the user did not transition from a first task to a second task within that time period. It is an advantage that the cognitive load is determined when the user did not transition from a first task to a second task. As a result, the cognitive load is task specific and more accurate than if the cognitive load was measured during task transition or over a time period that partly covers two different tasks.
Cognitive load may be selectively determined where perceptual load is below a threshold. It is an advantage that the method selectively only determines cognitive load where the perceptual load is below a threshold. Since it has been shown that cognitive load estimation is inaccurate under high perceptual load, this threshold results in a cognitive load estimate that is not negatively influenced by high perceptual load.
The period of time may be 1 second.
The method may further comprise:
receiving a stream of images; and
selecting from the stream of images the multiple consecutive images based on the period of time.
The method may further comprise repeating the steps of receiving, selecting, determining the value, determining the first measure of change and determining whether the user transitioned from a first task to a second task.
The period of time of a current repetition may be based on the determination whether the user transitioned from a first task to a second task in one or more preceding repetitions.
The period of time of a first repetition may overlap with the period of time of a second repetition immediately preceding or following the first repetition, such that one or more of the multiple consecutive images of the first repetition are also included in the multiple consecutive images of the second repetition.
The period of time of the first repetition may overlap with the period of time of the second repetition by half the period of time, such that one half of the multiple consecutive images of the first repetition are also included in the multiple consecutive images of the second repetition.
The processing time of one repetition of the method by a processor may be shorter than the period of time. The number of the multiple consecutive images may be such that the processing time of one repetition of the method by a processor is shorter than the period of time.
Each of the eye features may represent a change of the eye over the multiple consecutive images.
The method may further comprise:
locating a boundary of the pupil of the eye in each of the multiple images; and
fitting one or more ellipses to the boundary, each ellipse defining an axis, wherein determining the value representing the eye feature is based on one or more of the axes of the one or more ellipses.
The fitting step may comprise fitting a first ellipse to the boundary and fitting a second ellipse to only a section of the boundary that is located below a dividing line through the pupil. The first ellipse may define a minor axis and the second ellipse may define a major axis, and determining the measure of change may comprise detecting a blink if the major axis of the second ellipse is greater than twice the minor axis of the first ellipse.
Each of the first ellipse and the second ellipse may define a centre point; and determining the measure of change may be based on a difference between the centre point of the first ellipse and the centre point of the second ellipse.
The second ellipse may further define a major axis and determining the measure of change may comprise detecting a blink if the difference is less than half the major axis of the second ellipse.
The measure of change may be based on one or more of:
blink,
pupil dilation,
fixation, and
saccade.
The measure of change may be based on one or more of:
average pupil size,
pupil size range,
pupil size standard deviation,
average pupil orientation,
pupil orientation range,
pupil orientation standard deviation,
average pupil eccentricity,
pupil eccentricity range, and
pupil eccentricity standard deviation.
The measure of change may be based on one or more of:
number of fixations during the period of time,
fixation duration per fixation,
standard deviation of the fixation duration,
first difference of fixation duration,
entropy-based feature on fixation,
average amplitude of saccade,
standard deviation of saccade amplitude, and
first difference of saccade amplitude.
The measure of change may be based on an average spectral power over one or more of the bands:
0.1Hz to 0.5Hz,
0.5Hz to 1.0Hz,
1.0Hz to 1.5Hz,
1.5Hz to 2.0Hz,
2.0Hz to 2.5Hz,
2.5Hz to 3.0Hz,
3.0Hz to 3.5Hz, and
3.5Hz to 4.0Hz.
The measure of change may be based on one or more of:
blink number,
inter-blink interval,
blink duration per blink number,
blink product,
blink rate, and
blink duration per blink.
The first measure of change may be based on:
an average spectral power over the bands
0.5Hz to 1.0Hz,
0.1Hz to 0.5Hz,
1.0Hz to 1.5Hz,
1.5Hz to 2.0Hz,
2.0Hz to 2.5Hz and
2.5Hz to 3.0Hz,
blink number,
pupil size range,
pupil size standard deviation,
blink duration per blink number,
blink product, and
inter-blink interval.
The second measure of change may be based on:
blink number,
blink duration per blink number,
inter-blink interval,
average amplitude of saccade,
blink product,
first difference of fixation duration,
standard deviation of the fixation duration,
number of fixations during the period of time,
fixation duration per fixation,
entropy-based feature on fixation,
pupil eccentricity range, and
pupil eccentricity standard deviation.
The third measure of change may be based on:
average pupil size,
average amplitude of saccade,
pupil size standard deviation,
pupil size range,
standard deviation of saccade amplitude,
pupil eccentricity standard deviation,
pupil orientation standard deviation,
standard deviation of the fixation duration,
average spectral power over the band 0.1 Hz to 0.5Hz,
first difference of saccade amplitude,
fixation duration per fixation, and
blink number.
In a second aspect there is provided software that, when installed on a computer, causes the computer to perform the method of the first aspect.
In a third aspect there is provided a computer system for monitoring a user performing tasks, the computer system comprising:
an input port to receive multiple consecutive images of an eye of the user captured over a period of time; and
a processor
to determine a value representing an eye feature in each of the multiple consecutive images to obtain multiple consecutive values;
to determine a first measure of change that characterises change of the eye feature over the multiple consecutive images based on the multiple consecutive values, and
to determine whether the user transitioned from a first task to a second task during the period of time based on the first measure of change.
In a fourth aspect there is provided a wearable device for monitoring a user performing tasks, the wearable device comprising:
one or more cameras to capture over a period of time multiple consecutive images of an eye of the user;
a feature extractor to determine a value representing an eye feature in each of the multiple consecutive images to obtain multiple consecutive values;
an analyser to determine a first measure of change that characterises change of the eye feature over the multiple consecutive images based on the multiple consecutive values;
a detector to determine whether the user transitioned from a first task to a second task during the period of time based on the first measure of change; and
an output port to send a detection signal if the detector determined that the user transitioned from the first task to the second task.
The wearable device may further comprise a light source to illuminate the eye of the user.
The light source may be an infra-red light source.
The light source may be a light emitting diode.
Optional features described of any aspect, where appropriate, similarly apply to the other aspects also described here.
Brief Description of Drawings
An example will be described with reference to
Fig. 1 illustrates a computer system for monitoring a user performing tasks.
Fig. 2 illustrates a method for monitoring the user performing tasks.
Fig. 3 illustrates an exemplary stream of images.
Fig. 4 illustrates the eye of the user in more detail.
Fig. 5 illustrates parts of a wearable device.
Best Mode for Carrying Out the Invention
Fig. 1 illustrates a computer system 100 for monitoring a user 102 performing tasks. The computer system 100 comprises computer 104 and camera 106. Computer 104 includes a processor 114 connected to a program memory 116, a data memory 118, a communication port 120 and a user port 124. A display device 126 displays a user interface 128 to the user 102. Software stored on program memory 116 causes the processor 114 to perform the method in Fig. 2, that is, the processor 114 receives via the communication port 120 multiple images of an eye 132 of the user 102 captured over a period of time by camera 106, determines a value representing an eye feature of the eye 132 of user 102, determines a measure of change and determines whether the user transitioned between tasks.
Although the computer 104 is shown separately from the camera 106, it is to be understood that the computer 104 may be integrated into the camera or even into a single chip with the image sensor. In another example, the computer system 100 is integrated into a wearable device, such as a pair of glasses. The processor 114 receives data from data memory 118 as well as from the communications port 120 and the user port 124, which is connected to a display 126 that shows a user interface 128 to the user 102. The processor 114 receives the multiple consecutive images from the camera 106, the data memory 118 or any other source via communications port 120, such as by using a proprietary or standard communications protocol, such as USB or FireWire, a Wi-Fi network according to IEEE 802.11 or a LAN connection. The Wi-Fi network may be a decentralised ad-hoc network, such that no dedicated management infrastructure, such as a router, is required.
Consecutive in this context means that the images may be ordered one behind another, such as by recording time. It is not limited to all images from the camera 106 but some images may be dropped or deleted. Further, it is not necessary that the images are received in the same consecutive order but the images may be received out of order and then accessed in the right order. Receiving multiple consecutive images also covers examples where processor 114 receives references to images and then accesses the images by requesting and receiving each individual image from a data store, such as a video server. Although communications port 120 and user port 124 are shown as distinct entities, it is to be understood that any kind of data port may be used to receive data, such as a network connection, a memory interface, a pin of the chip package of processor 114, or logical ports, such as IP sockets or parameters of functions stored on program memory 116 and executed by processor 114. These parameters may be handled by-value or by-reference in the source code. The processor 114 may receive the multiple consecutive images and other data through all these interfaces, which includes memory access of volatile memory, such as cache or RAM, or non-volatile memory, such as an optical disk drive, hard disk drive, storage server or cloud storage. The computer system 100 may further be implemented within a cloud computing environment.
While the example in Fig. 1 includes a camera 106, it is to be understood that any image capturing device may be used. In particular, an optical image sensor with or without optical lenses may be used, as well as other image capturing devices, which may be useful for machine vision only. Further, monochromatic devices as well as colour devices may be used as camera 106. Even further, infrared or hyperspectral cameras with a large number of bands, such as 1000, may be used. In one example, the system 100 comprises a pair of safety glasses frames, worn during tasks, with two lightweight infrared web cameras mounted on it, each pointing at one eye. The two web cameras may comprise a light source to illuminate the eye 132 of the user 102, such as two infrared (IR) light emitting diodes (LEDs) soldered onto each camera board. The web cameras are connected to a computer 104 with USB 2.0 cables and videos of only eye images are recorded at 30 frames per second in AVI format. IR light is invisible to the eye and is therefore non-intrusive to participants, but produces the darkest pupil effect; hence the quality of the pupil image is improved. The system may measure the eye length in millimetres.
In addition to task transition, this disclosure relates to perceptual load and cognitive load, in relation to some of the generic attributes of the task. Perceptual load occurs when a user is required to perceive more different items, or the same items with more demands on individual identification (e.g. identify an item colour and position at the same time), particularly in a short period of time [11]. It is a rather passive process, and some items cannot be perceived when perceptual load is high [11]. As high perceptual load can exhaust the available capacity for information processing, the format of user interface 128 on display 126 needs to be carefully designed to reduce perceptual load [12,13]. Although perceptual load can be estimated by counting the number of items in a controlled environment (e.g. when the user is staring straight at a known application), it is not easy to estimate it when the items perceived are uncertain (e.g. when the user is mobile or focused on some other task).
Another factor that influences user behaviour in interaction, which can be used to infer task characteristics, is cognitive load. The concept of cognitive load has been used in a variety of fields that deal with the human mind interacting with some external stimuli. The definition of cognitive load is slightly different in each field. For instance, in pedagogical literature cognitive load refers to the total amount of mental activity imposed on working memory at any instant in time; while in ergonomics literature it is described as the portion of operator information processing capacity, or resources that are required to meet cognitive task demands. Each field provides different methods to measure cognitive load. In this specification the phrase "cognitive load" is defined as in the cognitive psychology literature. Cognitive load is defined here as the mental effort or demand required to process information in working memory for a particular user to comprehend or learn some material, or complete some task [F. Paas, et. al., "Cognitive load measurement as a means to advance cognitive load theory". Educational Psychologist, 2003, 38, 63-71.]. Cognitive load is relative to both the user (i.e. their ability to process novel information) and the task being completed (i.e. complexity), at any single point in time. It is attributable to the limited capacity of a person's working memory and their ability to process novel information.
In contrast to perceptual load, cognitive load is associated with working memory and increases with more demands on it [14]. It is a more active process than perceptual load [11]. For example, the average accuracy of electroencephalogram (EEG) signals for classifying three load levels has been reported to be 43-90% with an increasing window size from 2-120 s in an n-back memory task [15]. With functional near-infrared spectroscopy, the average accuracies for two levels of cognitive load may range from 76%-94% in three single tasks [16]. An average accuracy of 81% for low and high cognitive load was reported using a combination of heat flux and electrocardiographic (ECG) signals, where load levels were classified separately in each of the three task types [17]. Around 82% accuracy was achieved using a combination of EEG, ECG, electrooculographic (EOG) and respiration signals for two levels of cognitive load, where the levels were manipulated by the number of events [18]. The highest average accuracies for three levels of cognitive load were reported using speech, namely 72.6% and 88.5% in two different controlled datasets [19].
Cognitive load is a performance based measure and it is noted that a measure of the complexity of the task cannot measure a person's cognitive load. This is an important distinction. Two different people can perform the same task and each be under vastly different cognitive loads. While the complexity of the task may impact the cognitive load of a person, the complexity of the task is not in itself a measure of the cognitive load being experienced by a given individual. This is why measuring cognitive load directly is important: task complexity (or workload, stress or attention span) is not a measure of cognitive load. A practical example shows this clearly: a first person may find a particular task easy and use little mental effort, while a second person performing the same task exerts a lot of mental effort. In both cases, the complexity of the task has not changed. Therefore, saying that a measure of the complexity of the task is also a measure of each person's cognitive load would simply be wrong, as while the task is the same, the cognitive load experienced by two different individuals is very different.
In order to further clarify the distinction between perceptual load and cognitive load, an exemplary experiment is now provided. It is noted that this experiment or other experiments may be used as a training procedure to determine a transition classification parameter. The transition classification parameter can then be used to determine whether the user transitioned between two tasks. For example, the setup of the training procedure allows the annotation of time windows as task transitions. A measure of change is then determined during the training procedure and a classifier is trained using the annotations and the determined measure of change. In the case of task transitions, the trained classifier is stored on data store 118 in the form of transition classification parameters, such as by fitting a logistic regression model to the data. In one example, the transition classifier is based on a logistic function and the parameters β0 and β1 are the transition classification parameters. Similarly, perception classification parameters and cognition classification parameters are determined and stored in the data store in order to later determine perceptual load and cognitive load respectively.
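By way of illustration, a minimal sketch of such a logistic transition classifier is given below. The names beta0 and beta1 correspond to the transition classification parameters β0 and β1 above; the numeric values and the choice of input features are purely hypothetical and would in practice come from the training procedure and be read from data store 118.

    import numpy as np

    def transition_probability(features, beta0, beta1):
        # Logistic transition classifier: probability that the current
        # time window contains a task transition, given the measures of
        # change computed for that window.
        z = beta0 + np.dot(beta1, features)
        return 1.0 / (1.0 + np.exp(-z))

    # Illustrative values only: two measures of change for one window,
    # e.g. average spectral power in 0.5-1.0Hz and blink number.
    beta0, beta1 = -2.0, np.array([3.1, 0.8])  # hypothetical trained parameters
    features = np.array([0.6, 1.0])
    is_transition = transition_probability(features, beta0, beta1) > 0.5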
Exemplary experiment or training procedure
To collect a measure of change, also referred to as eye activity data, in an intensive task switching situation, we conducted an experiment in which three types of tasks, each involving a different level of perceptual load, and five levels of task difficulty (cognitive load) in each type of task, were deployed. These tasks were continuously performed, and switching between different types and levels of load was induced.
Perceptual load was controlled by the number of stimuli required to be perceived in each task type. The low perceptual load task was a mental arithmetic task, where four numbers sequentially replaced one of the ten 'X's (displayed in a circle on screen) in a 3-s interval, then participants used a mouse to select the correct answer from 10 displayed choices. The medium perceptual load task was a spatial task, where all task stimuli were displayed simultaneously, including the direction labels of two compasses, 12 radio buttons (displayed in a circle) indicating 12 directions, and a pointer to one of the 12 directions of the inner compass (inside the circle). Participants were required to identify the extent to which the two compasses were misaligned and click the corresponding direction of the pointer in the outer compass (outside the circle).
The high perceptual load task was a search task, where 28 stimuli were not only simultaneously displayed in 28 boxes, which were evenly distributed in two rows and two columns, but also moved by one box every 3 s in one of the line directions: right, left, up and down. The target number was displayed for 2 s at the beginning of the task, and participants were required to click as many targets as possible, although there were only four targets in each task.
Cognitive load was induced by increasing the memory set, or the degree of the interactivity of task stimuli. In the arithmetic task, task difficulty levels were regulated by the number of digits for addition (up to 3 digits) and carries produced by addition. In the spatial task, the difficulty level was increased when the spatial compatibility between the two compasses was reduced by removing more direction labels. In the search task, the difficulty from level 1 to 5 was increased by increasing the number of the target digits from 1 to 5. Each digit was selected from {1,2,..., 9} with minimum repetition. Distractors were selected to have similar digits as much as possible but in a different sequence, to prevent participants from only memorizing a few digits.
Tasks were continuously switched from one to another until a message was displayed on the screen to indicate an end, for a break. The arithmetic and spatial tasks were internally switched to the next task by a click to submit the answers, and the search task was externally switched, by ending automatically after 12 s. The executed task sequence was always one group of the three task types, followed immediately by another group of them. The sequence of the group was randomized permutations of cognitive load levels, 1 to 5, in groups of three task types, with repetition, therefore 125 (5 x 5 x 5) groups of three were generated in total.
The randomized groups were then separated into six parts. The six parts were evenly allocated into two days. Before the three parts each day, there was a rating part where participants rated the task difficulty at the end of each task. Half of the participants began with the rating plus the first three parts on the first day, and half of them began with the rating plus the other three parts.
To make it more realistic in terms of luminance variation, we set the three types of task in different background colours, namely, white, grey and black for the spatial, arithmetic and search tasks respectively, since task switching is often associated with background change in real-world tasks.
Twenty-two participants completed all tasks in two different days after a short training period before the experiment. Each part had a duration of less than 25 minutes, and there was a 10-minute break between each part.
Method
Fig. 2 illustrates a method 200 performed by processor 114 for monitoring the user 102 performing tasks. One characteristic of interest is task switching, which often occurs for a priority change and external interruptions [1]. It is an inevitable phenomenon that most people find it difficult to carry out multiple activities simultaneously because of their finite ability.
Tasks include a wide range of activities that the user 102 performs, such as reading an email on device 126 or driving a car. Switching between tasks may be switching between tasks that are both performed on device 126, or switching between tasks where only one of the tasks, or no task, is performed on device 126. It is noted here that the tasks need not necessarily be computing tasks. The transition might be, for example, between talking to somebody else and reading a book. The key point is that the computer 104 should be aware of this transition.
In some examples, device 126 is not included in computer system 104, and task transitions are determined for other purposes such as sending and storing the task detection signal associated with a time value, such as a time stamp, on data store 118. While in one example the processor 114 stores the start or end time of a task as the time value, in other examples processor 114 stores the duration of the task as the time value. This way a log of the user's activity is created and the user 102 or another person, such as the user's supervisor, can re-visit the log to examine the user's activity. In one example, the user 102 is an air traffic controller and makes a mistake. At a later stage, the log can be examined and it can be determined whether the user's mistake is related to switching between tasks. Of course, the task detection signal may be stored on a data store that is remote from the computer 104 in Fig. 1.
The method commences by the processor 114 receiving 202 multiple consecutive images of the eye 132 of the user 102 captured over a period of time, such as a time window. In one example, the processor 114 selects the multiple consecutive images from a received stream of images, such as a continuous video stream, by selecting all images within the time window. Fig. 3 illustrates an exemplary stream of images 302. In this example, the stream of images is a video with 30 full frames per second and the time window is 1 s. Processor 114 therefore selects 30 frames and covers a time window 304 of 1 s. The next time window i + 1 306 overlaps with the time window i 304 by half the window time, that is, 50%. Processor 114 employs an average filter 308 over 3 frames to eliminate noise in the multiple consecutive images 302. In this example, half the window time is exactly 50%. However, slight variations still count as half the window time, in particular if the number of images in a window is odd and no exact half can be determined. As a result, one half can have fewer images than the other half. In the example above it was found advantageous that the window immediately before the current window and the window immediately after the current window together cover the entire current window. As a result, each image of the video stream is included in exactly two time windows.
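A minimal sketch of this windowing scheme follows, assuming a 30 frames-per-second sequence of per-frame values (such as pupil diameters); the function name and parameters are illustrative only.

    import numpy as np

    def sliding_windows(values, fs=30, window_s=1.0, overlap=0.5):
        # Yield 1-s windows with 50% overlap after a 3-frame average
        # filter, mirroring filter 308 and windows 304/306 of Fig. 3.
        smoothed = np.convolve(values, np.ones(3) / 3.0, mode="same")
        size = int(window_s * fs)
        step = max(1, int(size * (1.0 - overlap)))
        for start in range(0, len(smoothed) - size + 1, step):
            yield smoothed[start:start + size]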
It is noted here that the described window configuration is applicable for determining task transition, estimating perceptual load and cognitive load as well as for training the classifier based on training data.
Each frame comprises the eye 132 of the user 102, either partially or fully. Using eye activity, that is, change of the eye feature over the multiple consecutive images, for task monitoring has advantages over other psychophysiological and behavioural signals. Firstly, eye activity contains three classes of information: the pupillary response (dilation), blink and eye movement (saccade and fixation).
The correlations between the three classes of eye activity are low [6], and as a measure, it is non-intrusive, objective and can be applied in broad scenarios. Furthermore, monitoring eye activity through cameras has been facilitated by recent eye tracking technology [34], making it more convenient and less intrusive.
Processor 114 performs video processing in order to determine 204 a value representing an eye feature in each of the multiple consecutive images to obtain multiple consecutive values. In one example, the eye feature is the pupil of the eye 132, the value is a characteristic of the pupil of the eye, and the further processing steps are based on that characteristic, such as the pupil visibility, dilation or diameter, as described in more detail below. However, other values representing eye features, such as eyelid position, or colour of the sclera or cornea, may also be used.
The next step of method 200 in Fig. 2 is that processor 114 determines 206 a measure of change that characterises change of the eye feature, such as change of the pupil, over the multiple consecutive images based on the multiple consecutive values representing the pupil, such as multiple measurements of the pupil diameter. In one example, the processor 114 determines three classes of measures of change that measure eye activity, that is, the measures of change characterise change of the pupil over the multiple consecutive images. The pupil is located in the multiple consecutive images and ellipse fitting is employed to estimate the pupil shape. The major axis length in pixels is measured as pupil size. Meanwhile, the minor axis length is recorded for the calculation of eccentricity with the major axis length, and the orientation of the ellipse is also recorded as a feature. The ellipse centre position is logged as eye position for later fixation and saccade separation. Blinks are recorded when the pupil is occluded by at least half. The fitted ellipse is superimposed onto each video for visual inspection to ensure the true eye activity is correctly represented. The axis lengths, orientation and eye position are linearly interpolated during blinks, and then passed through a median filter with length three frames to remove the noise caused by rapid eye movements. Pupil size is then converted to millimetres by the ratio of the true eye length and the eye length in the video.
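The post-processing of the per-frame pupil measurements described above (linear interpolation during blinks, three-frame median filter, conversion to millimetres) might be sketched as follows; the function name and the px_per_mm parameter are assumptions for illustration.

    import numpy as np
    from scipy.signal import medfilt

    def clean_pupil_series(pupil_px, is_blink, px_per_mm):
        # Linearly interpolate pupil sizes over blink frames, median
        # filter with a three-frame kernel to remove noise caused by
        # rapid eye movements, and convert pixels to millimetres.
        pupil = np.asarray(pupil_px, dtype=float)
        good = ~np.asarray(is_blink, dtype=bool)  # frames without a blink
        t = np.arange(len(pupil))
        pupil = np.interp(t, t[good], pupil[good])
        pupil = medfilt(pupil, kernel_size=3)
        return pupil / px_per_mm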
Fig. 4 schematically illustrates the eye 132 of the user 102 in more detail. The eye 132 comprises an eyeball, which is mostly covered by an eye lid 404 and the skull (not shown) of the user 102. The eye 132 further comprises an iris 406 and a pupil 408 centrally located in the iris 406. In this example, the iris 406 and the pupil 408 are half covered by the eye lid 404 since the eye 132 was captured during a blink. As a result, when processor 114 determines a boundary of the pupil 408, the result is the bottom half of the pupil bounded by the lid 404 at the top, which is shown in Fig. 4 by the solid line of the bottom section of pupil 408 and the solid line of lid 404. Processor 114 fits a first ellipse 410 to the entire boundary of the pupil 408 and as a result, the first ellipse fits entirely within the boundary. Further, the processor 114 fits a second ellipse 412 only to a section of the boundary that is below the line of the eye lid 404, that is, the part of the boundary that belongs to the pupil 408 itself and not the eye lid 404. As a result, the second ellipse 412 is typically larger than the detected boundary.
The fitting of the ellipses 410 and 412 comprises for each ellipse determining a centre point and major and minor axis lengths by minimising a least squares error function of distances of points on the boundary from the ellipses 410 and 412.
Processor 114 performs the two-ellipse fitting scheme for blink detection, with the rationale that the bottom half of the pupil boundary is less vulnerable to occlusions and that the centre positions of the two ellipses differ during blinking. Therefore, the difference of the two centre positions is set to be half of the major axis length of the first ellipse as the blink threshold.
In another example, processor 114 detects a blink if a2 > 2b1, to save the computation of calculating the distance of the centres while similar performance is achieved. Here, b1 is the minor axis of the first ellipse 410 using the whole boundary and a2 is the major axis of the second ellipse 412 using the lower half of the pupil boundary.
To automatically detect blinks, each frame is firstly converted into a binary image, and geometric and temporal criteria are used to find the pupil blob. Then the whole boundary and the half bottom of the boundary are located to fit ellipses. It is noted that this process is also performed in order to calculate pupil dilation as described below.
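A sketch of the two-ellipse blink test is shown below, assuming a single-channel uint8 binary pupil mask per frame and using OpenCV's contour and ellipse fitting functions. The simplified a2 > 2·b1 rule from above is applied, and the dividing line through the pupil is approximated by the mean boundary height; these choices are illustrative, not the only possible implementation.

    import cv2
    import numpy as np

    def is_blink(pupil_mask):
        # Two-ellipse blink test on a binary pupil mask (one frame).
        # Returns True when the frame is treated as a blink.
        contours, _ = cv2.findContours(pupil_mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_NONE)
        if not contours:
            return True                  # no pupil blob visible
        boundary = max(contours, key=cv2.contourArea).reshape(-1, 2)
        if len(boundary) < 5:
            return True                  # too few points to fit an ellipse
        # First ellipse 410: fitted to the whole pupil boundary.
        _, axes1, _ = cv2.fitEllipse(boundary)
        b1 = min(axes1)                  # minor axis length
        # Second ellipse 412: fitted only to the boundary section below
        # a dividing line through the pupil (image y grows downwards).
        lower = boundary[boundary[:, 1] > boundary[:, 1].mean()]
        if len(lower) < 5:
            return True
        _, axes2, _ = cv2.fitEllipse(lower)
        a2 = max(axes2)                  # major axis length
        return a2 > 2.0 * b1             # simplified blink rule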
Fixation and saccade are separated by dispersion-based algorithms [25] using one degree of visual angle for at least 200 ms. As the processor does not detect head motion in this example, the fixation and saccade used in this example are the fixation and saccade relative to the head position. A time span from 0.5 s before the end of the previous task to 0.5 s after the beginning of the current task is defined as a task transition. In one example, 0.5 s is the maximum latency of an eye activity response, from a stimulus to brain and brain to eye activity, as indicated by neuroimaging studies [26]. Therefore, a 1-s time window with 50% overlap may be used in task transition detection. From the recorded task timeline, all time windows that crossed the instant of a task beginning or end are labelled as task transitions in the ground truth.
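A dispersion-based fixation/saccade separation in the spirit of the algorithms referenced above [25] can be sketched as follows. The 200 ms minimum duration follows the description; the dispersion threshold is given here in pixels as a stand-in for one degree of visual angle, and all names are illustrative.

    import numpy as np

    def dispersion(window):
        # I-DT style dispersion: (max x - min x) + (max y - min y).
        return np.ptp(window[:, 0]) + np.ptp(window[:, 1])

    def fixations(eye_xy, fs=30, max_dispersion=15.0, min_duration_s=0.2):
        # Dispersion-based fixation detection over per-frame eye
        # positions (N x 2 array). Returns (start, end) frame index
        # pairs of fixations; frames outside these spans are saccades.
        eye_xy = np.asarray(eye_xy, dtype=float)
        min_len = int(min_duration_s * fs)
        spans, start, n = [], 0, len(eye_xy)
        while start + min_len <= n:
            end = start + min_len
            if dispersion(eye_xy[start:end]) <= max_dispersion:
                # Grow the window while dispersion stays under threshold.
                while end < n and dispersion(eye_xy[start:end + 1]) <= max_dispersion:
                    end += 1
                spans.append((start, end))
                start = end
            else:
                start += 1
        return spans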
Measures of change
From the three classes of eye activity, processor 114 calculates a total of 29 measures of change, including statistics, from the time domain signals, frequency domain and entropy. The measures of change are calculated from a time window of 1 s (1-s time window), and from a time window of 2 s (2-s time window), with 50% overlap. The measures of change characterise the change of the eye over multiple images, such as frames or averaged frames, and therefore each measure of change is based on multiple consecutive values representing the eye feature, such as multiple measurements of the pupil diameter.
From the multiple consecutive values representing eye features, that is, the pupil time series, processor 114 calculates the average pupil size (PDmean), pupil size range (PDrang), pupil size standard deviation (PDstd), average pupil orientation (POmean), pupil orientation range (POrang), pupil orientation standard deviation (POstd), average pupil eccentricity (PDeccMean), pupil eccentricity range (PDeccRange), and pupil eccentricity standard deviation (PDeccStd), over the time window.
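These window statistics could be computed as in the following sketch, where the inputs are the per-frame pupil size, orientation and eccentricity values for one analysis window; the dictionary keys follow the feature names above, everything else is illustrative.

    import numpy as np

    def pupil_features(size, orientation, eccentricity):
        # Window statistics of the pupil time series, using the
        # feature names from the description.
        def stats(values, prefix):
            return {prefix + "mean": np.mean(values),
                    prefix + "rang": np.ptp(values),
                    prefix + "std": np.std(values)}
        features = {**stats(size, "PD"), **stats(orientation, "PO")}
        features.update({"PDeccMean": np.mean(eccentricity),
                         "PDeccRange": np.ptp(eccentricity),
                         "PDeccStd": np.std(eccentricity)})
        return features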
For perceptual load and cognitive load estimation, the above measures are calculated after pupil baseline subtraction to reduce the light reflex effect. The baselines are the average pupil size during the 0.5 s at the end of each task transition. For this it is assumed that each task transition had been perfectly detected. This means that perceptual load and cognitive load are only determined for time windows where no task transition is detected. Alternatively, perceptual load and cognitive load are determined but not provided to a receiver of the load estimate when a task transition is detected.
In a further alternative, the estimates are provided to the receiver but the receiver is notified that the current estimate is inaccurate since the estimates were determined for a time window including a task transition. In this example, the processor 114 determines whether the user transitioned between tasks and sends a detection signal such that the receiver can reject received estimates of the perceptual and cognitive load.
Power spectral density from the pupil time series may also be considered. Due to the short length of the time window, processor 114 performs a parametric method for power spectrum estimation, which avoids spectral leakage and provides a better frequency resolution than FFT-based nonparametric methods. The Burg method for the autoregressive (AR) model is used and the AR model order is set to the length of the time window divided by three [27].
In one example, eight average spectral power (Ave_psd) features are then computed on eight frequency bands, spanning 0.1 to 4 Hz (the first band is 0.1 to 0.5 Hz, the second band is 0.5 to 1.0 Hz, and so on). In other examples, the frequency spectrum is divided into different bands.
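A sketch of these band-power features follows. Note that the description uses a Burg AR spectrum precisely because short windows give poor FFT resolution; for brevity this sketch substitutes a zero-padded Welch periodogram, which is a simplification rather than the parametric method described.

    import numpy as np
    from scipy.signal import welch

    # Eight bands spanning 0.1-4.0 Hz in 0.5 Hz steps, as described above.
    BANDS = [(0.1, 0.5)] + [(lo, lo + 0.5) for lo in np.arange(0.5, 4.0, 0.5)]

    def ave_psd_features(pupil_series, fs=30.0):
        # Average spectral power (Ave_psd) per band of the pupil series.
        # Zero-padding (nfft) interpolates the spectrum so that the
        # narrow bands are populated even for a short window.
        freqs, psd = welch(pupil_series, fs=fs,
                           nperseg=len(pupil_series), nfft=512)
        return [psd[(freqs >= lo) & (freqs < hi)].mean() for lo, hi in BANDS]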
Blinks can be categorized into two types: functional blink, which only occurs two to four times per minute to moisturize the eyes, and non-functional blink. Non-functional blink includes reflex blink (a protective response, e.g. to a puff of air), voluntary blink (a purposeful response that depends on one's will) and endogenous blink (which occurs unconsciously). However, the majority of human blink behaviour is endogenous blink, which is centrally controlled and has strong links to cognition [9]. Therefore, for the purpose of cognitive load estimation, a task goal and knowledge of the task duration are helpful to exclude most non-cognitive blinks by selecting appropriate task time windows.
With regard to blink detection, video-based sensors may be used. One approach is to observe the pupil diameter waveform. Blinks may be estimated by detecting points of zero pupil diameter (from eye tracker output), due to partial eye closure, from a sequence of frames. However, the number of frames that are treated as blink with partial eye closure may be empirically decided because the exact non-occluded pupil size is unknown.
A more advanced technique is to calculate the eyelid distance using the active appearance model, where both eye shape and appearance variation are learned from training data, and an initialization procedure is usually required before detection [5]. Example techniques include template matching and detecting luminance change in the eye regions to infer blink occurrence from far-field web cameras [11].
In one example, four blink features are considered as measures of change of the pupil. Blink number (blkNum) is calculated during the time window. Inter-blink interval (blkInterval) is the number of non-blink frames during the time window. Blink duration per blink number (blkDurPerNum) is the number of blink frames divided by the total number of blinks. Blink product (blkProduct) is the product of blink number per second and blink duration per second.
In another example, processor 114 uses four measures within each analysis time window. Blink rate (blinks per second) is calculated as the number of blinks that occurred divided by the time window duration. Blink duration per blink is the sum of blink durations during the time window normalized by the total number of blinks that occurred in that time. Inter-blink interval per second is the sum of the number of non-blink frames during the time window normalized by the duration. Processor 114 also calculates the blink product per second, which is the product of blink rate and blink duration per second.
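The first set of blink measures might be computed as sketched below from a per-frame boolean blink sequence; the dictionary keys follow the feature names above, everything else is illustrative.

    import numpy as np

    def blink_features(is_blink, fs=30):
        # Blink measures over one analysis window from per-frame blink
        # flags (True = blink frame).
        is_blink = np.asarray(is_blink, dtype=bool)
        # A blink is a maximal run of consecutive blink frames.
        padded = np.concatenate(([0], is_blink.astype(np.int8), [0]))
        blk_num = int((np.diff(padded) == 1).sum())  # number of blink runs
        blink_frames = int(is_blink.sum())
        duration_s = len(is_blink) / fs
        return {
            "blkNum": blk_num,
            "blkInterval": int((~is_blink).sum()),   # non-blink frames
            "blkDurPerNum": blink_frames / blk_num if blk_num else 0.0,
            "blkProduct": (blk_num / duration_s)
                          * (blink_frames / fs / duration_s),
        }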
To classify the cognitive load levels and different perceptual load tasks, the time window may be set as the task duration after removing the time span of 0.5 s after task beginning and 0.5 s before task end, so that during the time window, a user is actually engaged in tasks. The time span from 0.5 s before the end of the previous task to 0.5 s after the beginning of the current task is defined as the task transition state, which involves task environment change at the beginning of the task and requires button clicking at the end.
Fixation and saccade measures may also be considered. From the time series, the processor 114 calculates the number of fixations (fixNum) during the time window, the fixation duration per fixation (fixDur), the standard deviation of the fixation duration (fixDurStd), and the first difference of fixation duration (fixDurDiff). In one example, processor 114 uses an entropy-based feature on fixation (fixEntropy), where an empirical probability mass function is generated by normalising a sequence of fixation durations (in frames) by the total number of frames in the time window, and the Shannon entropy is calculated from this. For the saccade features, processor 114 computes the average amplitude of saccade (sacAmp), the standard deviation of saccade amplitude (sacAmpStd) and the first difference of saccade amplitude (sacAmpDiff).
Referring back to Fig. 2, the next step of method 200 is determining 208 whether the user transitioned from a first task to a second task during the period of time, that is, the time window, based on the measure of change. This determination step is now explained in detail.
Task transition detection
In this example, processor 114 computes all measures from a 1-s time window, overlapped by 0.5 s, for task transition detection. Sensitive features are initially selected using a pilot test, in which the data from the first two parts of each day is taken as the pilot training set (half of them are randomly selected as a validation set) and the data from the third part of each day as the pilot test set. Kappa scores are obtained from the classification decisions and the annotated ground truth. Data from the fourth part of each day is reserved for later use (testing data). In one example, the top 12 features ranked by the Kappa scores may be collected as a full set of sensitive features. Then, the best measures from the full set of sensitive measures and classifier parameters can be selected from training data to form the input of a classification system.
In one example, processor 114 applies a median filter of three frames to the classifier output to remove rapid fluctuations due to noise in task transition detection.
When reporting classification results, two schemes may be used to get a sense of how close the detected task transitions are to annotations made by the experimental setup.
Moment-by-moment Scheme: Task transitions are detected every 0.5 s, and the classification performance is obtained by comparing the classification decision with the annotated ground truth frame by frame.
State-by-state Scheme: From the detected task transitions above, any transition with frames overlapping with the annotation is recognized as a task transition, and declared as a task transition for classification performance calculation.
Perceptual load estimation
In one example, processor 114 determines the perceptual load (low/medium/high) every 1 s independently of cognitive load, using the measures of change derived from a 2-s time window, not including task transitions. In examples where blink and fixation measures extracted over 1 s are not able to produce measurable magnitudes for perceptual load, as observed from the data, time windows of more than 2 s long may be considered in practice. The subsequent procedures are similar to those for task transition detection, but without median filtering.
In one example, the processor 114 segments the signals into task windows according to determined task transition, and aggregates the classifier outputs using majority vote during the task window [28].
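The majority vote over one task window can be as simple as the following sketch:

    from collections import Counter

    def task_label(window_labels):
        # Aggregate per-window classifier outputs over one detected
        # task window by majority vote; ties resolve to the
        # first-counted label.
        return Counter(window_labels).most_common(1)[0][0]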
Moment-by-moment Scheme: Three perceptual load levels are estimated every 1 s. The classification performance is obtained by comparing the classification decision with the annotated ground truth frame by frame.
Task-by-task Scheme: From the estimated perceptual load levels above, processor 114 uses the label that won the majority vote from a task window to represent the task.
Cognitive load estimation
Similarly to perceptual load estimation, features derived from a 2-s time window are selected for cognitive load estimation, without the influence of task transition.
One factor that can influence cognitive load measurement is the level of perceptual load of the task. According to the load theory of attention, cognitive control typically occurs in situations of low perceptual load [11], which implies that cognitive load levels during high perceptual load tasks may not be distinguishable. Therefore, high perceptual load tasks may be excluded during model training. In one example, the classification decision for cognitive load is available only when the perceptual load is not high and therefore, cognitive load is selectively determined where perceptual load is below a threshold. As explained above, perceptual load is determined in three levels low, medium and high. As a result, the cognitive load is determined selectively where perceptual load is below medium, that is, in the low level. Alternatively, the cognitive load is determined selectively where perceptual load is below high, that is, in the low or medium level.
The other procedures and the schemes for classification decision are similar to that for perceptual load estimation. Three cognitive load levels were estimated every 1 s when the perceptual load was not high.
Measure selection and classification
To lower the computational costs and shorten the runtime, processor 114 may prune the sensitive measures set to a number of subsets for further selection of measures. The principle of the pruning is to retain two or three subsets in which the measures come from a single class of eye activity, and subsets in which the measures come from different classes of eye activity. This is to minimize the negative influence on classification performance due to individual differences in sensitive measures. For example, some participants have a very low blink rate, which may result in blink features losing their effect.
It is noted here that no combination of measures is excluded and that the particular combination of measures may be based on the characteristics of the particular user. In the following explanation, measures are selected that carry the most information. This is based on measurements from one or more particular users. The processor 114 may automatically choose the optimal measures for a particular user based on a training set, where time windows are annotated with a ground truth based on the experimental setup described earlier. This way, some of the time windows are labelled as task transitions, while others are labelled with different levels of perceptual or cognitive load.
In one example, processor 114 performs logistic regression for classification, for simplicity and to avoid over fitting, and employs a participant-dependent scheme. In different examples, processor 114 uses the Parzen Window or a Gaussian Mixture Model. In examples of real time application, k-fold cross validation may not be used. It is noted here that for task transition detection, a yes/no decision is made by the classifier, while for perceptual and cognitive load estimation the classifier may distinguish between multiple levels of each load. In one example, the same classifier, such as logistic regression, is used while in other examples, different classifiers are used to detect task transition and to estimate perceptual and cognitive load.
In other examples, a participant-dependent scheme is used and different models for the three classification tasks are trained, using the task duration as the time window. For each task, a model for each load level or state is trained using blink features, and then a test task is classified according to the maximum posterior probability or least loss. To classify the cognitive load level and task type, a 10-fold cross validation scheme may be used.
The data partitions are as follows: data from the first three parts of each day forms the training data. Among the training data, 50% is randomly selected as validation data to tune not only the regularization parameter but also the best subset of the measures. Data from the fourth part of each day may be used as the testing data.
Concerning the distribution of training data in each class, the training process faces a class imbalance problem. As easy tasks usually take less time than difficult tasks, this leads to more training samples in difficult tasks when a time window with fixed length is used. This problem is more severe for task transition detection since the number of transitions in most cases is significantly less than that of non-transitions. To deal with this, both downsampling and ensemble classifier schemes may be employed, where the number of classifiers is the class imbalance ratio (an integer). Each classifier is trained by the entire minority class and an equally sized, random subset of the majority class. The final predicted probability distribution is the averaged predicted probability from all the classifiers [29]. A test sample is then assigned the label corresponding to the maximum predicted probability.
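A sketch of this downsampling ensemble is given below, using scikit-learn's logistic regression as the base classifier (an assumption for illustration; any classifier producing posterior probabilities would fit the scheme):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def train_ensemble(X, y, seed=0):
        # Downsampling ensemble for the class imbalance problem: one
        # classifier per unit of imbalance, each trained on the entire
        # minority class plus an equally sized random majority subset.
        rng = np.random.default_rng(seed)
        X_min, y_min = X[y == 1], y[y == 1]   # e.g. transition windows
        X_maj, y_maj = X[y == 0], y[y == 0]
        n_models = max(1, len(y_maj) // len(y_min))  # imbalance ratio
        models = []
        for _ in range(n_models):
            idx = rng.choice(len(y_maj), size=len(y_min), replace=False)
            X_i = np.vstack([X_min, X_maj[idx]])
            y_i = np.concatenate([y_min, y_maj[idx]])
            models.append(LogisticRegression().fit(X_i, y_i))
        return models

    def classify(models, X):
        # Average the predicted probabilities over the ensemble and
        # take the label with the maximum averaged probability.
        proba = np.mean([m.predict_proba(X) for m in models], axis=0)
        return models[0].classes_[np.argmax(proba, axis=1)]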
In one example, three metrics are employed to describe the classification performance. Each of them is the averaged value across 22 participants. Correlation coefficient (R) measures the strength and direction of the relationship between the ground truth and the estimated labels. The absolute value of R in the ranges of 0-0.3, 0.3-0.7 and 0.7-1.0 can be considered to indicate weak, moderate and strong linear relationships respectively. Cohen's Kappa may also be used, which measures the magnitude of classification agreement and takes into account chance agreement. Generally speaking, a Kappa score between 0.01-0.20 can be interpreted as having slight agreement, 0.21-0.40, fair agreement (above 0.20 is often interpreted as statistical significance), 0.41-0.60, moderate agreement and 0.61-0.80, substantial agreement.
Table 1 shows the top 12 most sensitive measures of change for task transition. We can see that the average power in the frequency band of 0.5-1.0 Hz is the most sensitive measure. Meanwhile, all the sensitive measures are from two classes of eye activity, that is, pupillary response (*psd*) and blink (blk*). Measures of change from eye movements have little contribution to task transition.
Table 1. 12 measures of change with Kappa score above 0.11 were selected as a full feature set for task transition detection.
[Table 1 is available only as an image in the original document.]
Perceptual load estimation
Regarding the sensitive features for perceptual load, as Table 2 shows, blink measures (blk*) were the most sensitive measures, followed by fixation and saccade measures (sac*, fix*). The average blink number in low and medium perceptual load tasks is higher than that in the high perceptual load task. Pupillary response has less influence on perceptual load, since pupil size varies in a similar range in each perceptual load task.
Table 2. 12 measures of change with Kappa score above 0.20 were selected as a full feature set for perceptual load estimation.
Measure of change   Kappa    Measure of change   Kappa
blkNum              0.30     fixDurStd           0.23
blkDurPerNum        0.29     fixNum              0.23
blkInterval         0.29     fixDur              0.23
sacAmp              0.27     fixEntropy          0.23
blkProduct          0.27     PDeccRange          0.21
fixDurDiff          0.24     PDeccStd            0.20
Cognitive load estimation
As Table 3 shows, without the impact of high perceptual load, the average pupil size is the most sensitive measure for cognitive load. The top 12 sensitive measures are mainly from the classes of pupillary response (P*) and eye movement (sac*, fix*).
Table 3. 12 measures of change with Kappa score above 0.08 were selected as a feature set for cognitive load estimation.
[Table 3 is available only as an image in the original document.]
To improve the classification performance of cognitive load, there may be a trade-off between the resolution (number of load levels), time window and accuracy for cognitive load estimation. In some examples, 2 s may be too short for strong measurable changes in features, but some task durations are as short as 2 s in the easiest tasks. In practice, these are the parameters to be considered, and the values should be decided based on the application.
It is interesting to find that the sensitive measure sets for classifying task transition, perceptual load and cognitive load respectively comprised different classes of eye activity (as indicated in Tables 1, 2 and 3). For task transition, the effective measures are mainly from pupil power spectrum density and blink. Pupil size often drops rapidly at the end of one task and increases again when a new task begins [10]. This suggests that pupil size does not change frequently during a 1-s time window during task execution.
Measures for perceptual load are mainly from blink, fixation and saccade. Blink, fixation and saccade are correlated with the number of items to be perceived [6,21]. This confirms the efficacy of using fixation duration to indicate perceptual load [12]. For cognitive load, the major features are from pupil and eye movement, similarly to [22,24]. At least two classes of eye activity are capable of distinguishing each of the task characteristics, which provides sufficient measures for selection to overcome individual differences in feature preference. For example, the blink rate may be too low to be used for particular users. However, by selecting measures from another class of eye activity, fair classification performance may still be achieved.
The method 200 of Fig. 2 may be repeated, that is, iterated, such that a video stream from the camera 106 is continuously processed in consecutive time windows and the user is continuously monitored. A task monitoring system can analyse tasks using a 1-s window, which may be suitable for many real-time applications; this means the processing time needed by processor 114 to process the images of the 1-s window is less than 1 s. As a result, the processor 114 finishes processing the window of a current repetition before the next window becomes available and no backlog occurs. If the processor has a different processing power, the number of images, which is directly related to the time period or size of the window and the frame rate, may be changed such that real-time processing is possible.
In examples where the method 200 is repeated, it is possible to use results of a previous repetition for a current repetition. For example, the time period for estimating perceptual or cognitive load may extend from one detected task transition to the next detected task transition and may therefore span less than 1 s or more than 1 s, such that a single load estimate is determined for the entire time while the user 102 performs that task.
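A minimal sketch of such transition-delimited load estimation follows; estimate_load is a hypothetical stand-in for whatever load classifier is in use.

```python
# Sketch (assumption-labelled): buffer per-window measures until a task
# transition is detected, then estimate load once for the whole task.
class TaskLoadAccumulator:
    def __init__(self, estimate_load):
        self.estimate_load = estimate_load  # hypothetical classifier
        self.buffer = []

    def update(self, window_measures, transitioned):
        """Call once per window; returns a load estimate at task end."""
        if transitioned and self.buffer:
            # Task boundary: one estimate spanning the completed task,
            # which may be shorter or longer than a single window.
            estimate = self.estimate_load(self.buffer)
            self.buffer = [window_measures]
            return estimate
        self.buffer.append(window_measures)
        return None
```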
It is noted here that in some examples the processor 114 determines only the task transitions, while in other examples processor 114 determines perceptual load based on the task transitions. In further examples, processor 114 determines cognitive load based on the determined perceptual load and the task transitions. As a result, the computational effort, and therefore the number of frames or the size of each time window, may differ between these examples. Processor 114 may process the data in a way that is suitable for real-time implementation. For example, the classifier may be trained on the data from the first three parts of each day, with the optimum features and model parameters tuned on that data.
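Since claim 3 below notes that the transition classification parameters may be logistic regression parameters, a hedged sketch of such training is given here, assuming scikit-learn; the 3:1 train/test split mirrors the "first three parts of each day" description, and the names are illustrative.

```python
# Sketch (assumption-labelled): fit a logistic-regression transition
# classifier on the earlier portion of a day's windows, test on the rest.
from sklearn.linear_model import LogisticRegression

def fit_transition_classifier(X_day, y_day, train_fraction=0.75):
    """X_day: (n_windows, n_measures) array; y_day: 1 = transition."""
    n_train = int(len(X_day) * train_fraction)
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_day[:n_train], y_day[:n_train])
    # clf.coef_ and clf.intercept_ are the transition classification
    # parameters a deployed monitor would receive (cf. claims 2 and 3).
    return clf, clf.score(X_day[n_train:], y_day[n_train:])
```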
Once the processor 114 determines in the last step 208 in Fig. 2 that the user transitioned from a first task to a second task, the processor sends a detection signal. In one example, the processor 114 sends the detection signal to a computer system, such as the computer system that controls the display of the user interface 128 on the display device 126. Of course, this computer system may be the same computer system as system 104 in Fig. 1. By receiving the detection signal, the computer system is informed that the user transitioned between tasks and that it is now the best moment to present a notification to the user. Such a notification can be an acoustic signal, such as a ring tone, or a visual alert, such as a blinking or appearing symbol on display device 126, for example a toast pop-up at the bottom of the screen that informs the user about a new email or instant message.
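A minimal sketch of how a receiving computer system might hold notifications back until the detection signal arrives; the class and callback names are illustrative assumptions.

```python
# Sketch (assumption-labelled): queue notifications and release them only
# at task breakpoints signalled by the task transition detector.
from collections import deque

class BreakpointNotifier:
    def __init__(self, present):
        self.present = present   # callback that displays one notification
        self.pending = deque()

    def notify(self, message):
        self.pending.append(message)  # hold until a task breakpoint

    def on_detection_signal(self):
        # The user has just transitioned between tasks: flush the queue
        # now, when an interruption is least disruptive.
        while self.pending:
            self.present(self.pending.popleft())
```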
In other examples, the computer system that receives the detection signal is a control system, such as an instrument panel in an airplane cockpit, and a notification, such as a status message about the airplane's systems is displayed to the pilot only when it is determined that the pilot transitioned from one task, such as operating the radio system, to another task, such as navigating the airplane.
Similar to the above examples, processor 114 may further send notifications related to perceptual or cognitive load. For example, processor 114 may generate a notification when the load is greater than a given threshold, or may periodically log the load on data store 118, associated with a time value.
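A minimal sketch of the threshold alert and timestamped logging just described; the threshold value, alert callback and CSV log format are illustrative assumptions.

```python
# Sketch (assumption-labelled): alert above a load threshold and append a
# timestamped load value to a simple CSV log on the data store.
import csv
import time

def handle_load(load, alert, threshold=0.8, log_path="load_log.csv"):
    if load > threshold:
        alert("Load above threshold")        # hypothetical alert callback
    with open(log_path, "a", newline="") as f:
        csv.writer(f).writerow([time.time(), load])  # load with time value
```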
Further, processor 114 may send the detection signal to a computer system to trigger a response of the computer system, such as displaying a notification to the user 102.

Fig. 5 illustrates parts of a wearable device 500 integrated into a frame of glasses (not shown). A user wears the device 500 and the device 500 monitors the user performing tasks. The device 500 comprises the camera 106 from Fig. 1. The device 500 further comprises a processing module 502 including a windowing module 504 connected to a feature extractor 506, which is in turn connected to an analyser 508. The analyser 508 is connected to a task transition detector 510, a perceptual load estimator 512 and a cognitive load estimator 514. The task transition detector 510, perceptual load estimator 512 and cognitive load estimator 514 are connected to an output port 516.
When in use, the camera 106 captures a continuous stream of images of the eye of the user and provides the stream to the windowing module 504. The windowing module selects multiple consecutive images from the continuous stream so that the multiple consecutive images cover a set period of time. In a different example, the windowing module 504 is integrated into the camera, such that the camera captures the multiple consecutive images of an eye of the user over the period of time and there is no need to select the images from a video stream.
The windowing module 504 or camera 106 provides the multiple consecutive images to the feature extractor 506, which analyses the multiple consecutive images and determines a value representing the pupil by detecting and measuring the pupil in the multiple consecutive images. The feature extractor 506 passes the determined values, such as centre point coordinates and axis lengths, to the analyser 508. In another example, the feature extractor 506 also detects blinks and sends the blinks in the form of time values or frame numbers to the analyser 508. In a different example, the feature extractor 506 provides a pulse on a signal line when a blink is detected. The analyser 508 determines a measure of change, such as blink rate, based on the data received from the feature extractor 506. In one example the feature extractor 506 and the analyser 508 are combined to determine a measure of change directly from the multiple consecutive images as described earlier.
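Claims 14 to 18 below describe fitting a first ellipse to the whole pupil boundary and a second ellipse to the section of the boundary below a dividing line through the pupil, with a blink detected from the relation between their axes. A hedged sketch along those lines, assuming OpenCV is available; the intensity threshold and segmentation are illustrative assumptions.

```python
# Sketch (assumption-labelled): fit one ellipse to the full pupil boundary
# and one to its lower section, then apply a blink test of the kind in
# claim 16. The threshold value and segmentation are illustrative.
import cv2

def detect_blink(eye_gray):
    """eye_gray: grayscale eye image; returns (is_blink, pupil_ellipse)."""
    _, mask = cv2.threshold(eye_gray, 40, 255, cv2.THRESH_BINARY_INV)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    if not contours:
        return True, None               # no pupil found: treat as closed
    boundary = max(contours, key=cv2.contourArea).reshape(-1, 2)
    if len(boundary) < 5:
        return True, None               # fitEllipse needs >= 5 points
    (cx, cy), axes1, angle = cv2.fitEllipse(boundary)
    minor1 = min(axes1)                 # minor axis of the first ellipse
    # Second ellipse: boundary points below the dividing line through the
    # pupil centre (the upper part may be occluded by the eyelid).
    lower = boundary[boundary[:, 1] > cy]
    if len(lower) < 5:
        return True, ((cx, cy), axes1, angle)
    _, axes2, _ = cv2.fitEllipse(lower)
    major2 = max(axes2)                 # major axis of the second ellipse
    # Claim 16's test: blink if the lower-section major axis is greater
    # than twice the full-boundary minor axis.
    return major2 > 2.0 * minor1, ((cx, cy), axes1, angle)
```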
The measure of change is then sent to the task transition detector 510, which determines whether the user transitioned from a first task to a second task during the period of time based on the measure of change from the analyser 508. Finally, the task transition detector 510 uses output port 516 to send a detection signal if the task transition detector 510 determined that the user transitioned from the first task to the second task. The perceptual load estimator 512 estimates the perceptual load based on the measure of change determined by the analyser 508 and based on the task transition from the task transition detector 510. The cognitive load estimator 514 estimates the cognitive load based on the measures of change from the analyser 508, the perceptual load from the perceptual load estimator 512 and the task transition from the task transition detector 510.
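This module chain can be summarised as plain function composition; the following schematic sketch passes stand-ins for the components of Fig. 5 and is illustrative only.

```python
# Sketch (assumption-labelled): the Fig. 5 processing chain per window.
def process_window(images, extract, analyse, detect, est_perc, est_cog):
    values = [extract(img) for img in images]   # feature extractor 506
    change = analyse(values)                    # analyser 508
    transitioned = detect(change)               # task transition detector 510
    perceptual = est_perc(change, transitioned)            # estimator 512
    cognitive = est_cog(change, perceptual, transitioned)  # estimator 514
    return transitioned, perceptual, cognitive
```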
The methods, software and systems disclosed above may be used for a variety of applications, such as in the following examples.
Task analytics
Processor 114 may log users' tasks throughout the day and analyse them to discover more about how users spend their day, how long they take on particular tasks, how much of a particular type of load they are experiencing, whether they are overloaded, and so on, in a similar way to other analytics applications, such as bike computers or wearable biomedical logging devices.
Behaviour change
Processor 114 may automatically alert users if they are doing too much or too little of something in particular, with a view to modifying their behaviour.

Interruption management
Processor 114 may notify users (using any application) of an incoming message, email, call, requirement, etc. only during task breakpoints. This may reduce task error by 50%.

Tele-assistance
One advantage of wearable computing is that it allows a user to share their view of their task (through a head-mounted, outwards-facing camera) with a remote expert, who then guides them through the task or collaborates with them on it. For the remote expert, it is an advantage to receive an indication of when to interject, so as to minimise distraction and maximise the utility of the guidance for the user engaged in the task; this is difficult to judge with existing solutions when the user is remote from the expert.
Biomedical systems
So far biomedical engineering has been mainly concerned with physiological signals, but this is slowly changing to also consider behavioural signals. The latter have particular utility in the diagnosis, monitoring and treatment of mental disorders. Disorders for which this invention may be particularly useful include depression and dementia, such as Alzheimer's disease, but mental disorders are by definition mental or behavioural patterns or anomalies, so the utility may be quite broad.

Affective computing
The term refers to determining the emotional state of the user. As explained above, measuring load continuously, that is, across task transitions, has little significance. The same applies to measuring emotion. Therefore, some of the methods disclosed here, particularly accurate pupil diameter measurement, may be applicable to measuring emotion.
It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the specific embodiments without departing from the scope as defined in the claims. For example, the measures of change, task transition, and perceptual and cognitive load may be used as biomarkers for the diagnosis of mental diseases, as measures of cognition, for computer games, or for task informatics from eye activity (post-hoc analysis).
It should be understood that the techniques of the present disclosure might be implemented using a variety of technologies. For example, the methods described herein may be implemented by a series of computer executable instructions residing on a suitable computer readable medium. Suitable computer readable media may include volatile (e.g. RAM) and/or non-volatile (e.g. ROM, disk) memory, carrier waves and transmission media. Exemplary carrier waves may take the form of electrical, electromagnetic or optical signals conveying digital data streams along a local network or a publicly accessible network such as the internet.
It should also be understood that, unless specifically stated otherwise as apparent from the following discussion, it is appreciated that throughout the description, discussions utilizing terms such as "estimating" or "processing" or "computing" or "calculating" or "generating" or "detecting", "optimizing" or "determining" or "displaying" or "maximising" or the like, refer to the action and processes of a computer system, or similar electronic computing device, that processes and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.
REFERENCES
[1] Smith, W., The Complex Structure of Office Works: Tasks, Activities and Modes, OZCHI 2004.
[2] Bailey, B. P., Konstan, J. A., Carlis, J. V., Measuring the Effects of Interruptions on Task Performance in the User Interface, In 2000 IEEE International Conference on Systems, Man, and Cybernetics, 757-762.
[3] Czerwinski, M., Horvitz, E., Wilhite, S., A diary of task switching and interruptions, Proceedings of CHI 2004, Vienna, 175-182.
[4] Gonzalez, V. M., Mark, G., Constant constant, multitasking craziness: managing multiple working spaces, Proceedings of CHI 2004, Vienna.
[5] Bailey, B. P., Adamczyk, P. D., Chang, T. Y., Chilson, N. A., A framework for specifying and monitoring user tasks, Computers in Human Behavior, 22 (2006), 709-732.
[6] Chen, S., Epps, J., Ruiz, N., Chen, F., Eye Activity as a Measure of Human Mental Effort in HCI, In Proc. IUI 2011, Palo Alto, USA, 67-70.
[7] Altmann, E. M., Timecourse of recovery from task interruption: Data and a model, Psychonomic Bulletin & Review, 14, 6 (2007), 1079-1084.
[8] Adamczyk, P. D., Bailey, B. P., If Not Now, When: The Effects of Interruption at Different Moments Within Task Execution, CHI 2004, Vienna.
[9] Basoglu, K. A., Fuller, M. A., Sweeney, J. T., Investigating the effects of computer mediated interruption: An analysis of task characteristics and interruption frequency on financial performance, International Journal of Accounting Information Systems, 10 (2009), 177-189.
[10] Iqbal, S. T., Adamczyk, P. D., et al., Towards an Index of Opportunity: Understanding Changes in Mental Workload during Task Execution, CHI 2005, Portland.
[11] Lavie, N., Hirst, A., de Fockert, J. W., Load Theory of Selective Attention and Cognitive Control, Journal of Experimental Psychology: General, 133 (2004), 339-354.
[12] Lee, J., Forlizzi, J., Hudson, S. E., Iterative design of MOVE: A situationally appropriate vehicle navigation system, Int. J. Human-Computer Studies, 66 (2008), 198-215.
[13] Stedmon, A. W., Kalawsky, R. S., Hill, K., Cook, C. A., Old Theories, New Technologies: Cumulative Clutter Effects Using Augmented Reality, In Proc. IEEE International Conference on Information Visualization, 1999, 132-137.
[14] Paas, F., Tuovinen, J. E., Tabbers, H., Van Gerven, P. W. M., Cognitive Load Measurement as a Means to Advance Cognitive Load Theory, Educational Psychologist, 38, 1 (2003), 63-71.
[15] Grimes, D., Tan, D. S., Hudson, S., et al., Feasibility and pragmatics of classifying working memory load with an electroencephalograph, CHI 2008.
[16] Hirshfield, L. M., Gulotta, R., et al., This is Your Brain on Interfaces: Enhancing Usability Testing with Functional Near-Infrared Spectroscopy, CHI 2011, Vancouver.
[17] Haapalainen, E., Kim, S., Forlizzi, J. F., Dey, A. K., Psycho-physiological measures for assessing cognitive load, In Proc. UbiComp 2010, 301-310.
[18] Wilson, G. F., Russell, C. A., Real-Time Assessment of Mental Workload Using Psychophysiological Measures and Artificial Neural Networks, Human Factors, 45, 4 (2003), 635-643.
[19] Le, P., Ambikairajah, E., Epps, J., Sethu, V., Choi, E. H. C., Investigation of spectral centroid features for cognitive load classification, Speech Communication, 53, 4 (2011), 540-551.
[20] Stern, J. A., Walrath, C., Goldstein, R., The endogenous eyeblink, Psychophysiology, 21 (1984), 22-33.
[21] Van Orden, K. F., Limbert, W., et al., Eye Activity Correlates of Workload during a Visuospatial Memory Task, Human Factors: The Journal of the Human Factors and Ergonomics Society, 43 (2001), 111-121.
[22] Chen, S., Epps, J., Chen, F., A Comparison of Four Methods for Cognitive Load Measurement, In Proc. OzCHI 2011, Canberra, Australia, 315-318.
[23] Nakayama, M., Shimizu, Y., Frequency Analysis of Task Evoked Pupillary Response and Eye-movement, In Proc. of the 2004 Symposium on Eye Tracking Research & Applications, 71-76.
[24] Jacob, R. J. K., Karn, K. S., Eye Tracking in Human-Computer Interaction and Usability Research: Ready to Deliver the Promises, The Mind's Eye: Cognitive and Applied Aspects of Eye Movement Research, 2003, 573-603.
[25] Salvucci, D. D., Goldberg, J. H., Identifying Fixations and Saccades in Eye-tracking Protocols, In Proc. of the 2000 Symposium on Eye Tracking Research and Applications, New York, 71-78.
[26] Laeng, B., Sirois, S., Gredeback, G., Pupillometry: A window to the preconscious?, Perspectives on Psychological Science, 7(1), 2012, 18-27.
[27] Proakis, J. G., Manolakis, D. G., Digital Signal Processing, Upper Saddle River, New Jersey, 2007, 986-1009.
[28] Bulling, A., Ward, J. A., Gellersen, H., Troster, G., Eye Movement Analysis for Activity Recognition Using Electrooculography, IEEE Transactions on Pattern Analysis and Machine Intelligence, 33, 4 (2011), 741-753.
[29] Klement, W., Wilk, S., Michalowski, W., Matwin, S., Classifying Severely Imbalanced Data, In Proc. of the 24th Canadian Conference on Advances in Artificial Intelligence, 2011, 258-264.
[30] Monsell, S., Task switching, Trends in Cognitive Sciences, 7, 3 (2003), 134-140.
[31] Yeh, Y. Y., Wickens, C. D., Dissociation of performance and subjective measures of workload, Human Factors, 30 (1988), 111-120.
[32] Parasuraman, R., Sheridan, T. B., Wickens, C. D., Situation Awareness, Mental Workload, and Trust in Automation: Viable, Empirically Supported Cognitive Engineering Constructs, Journal of Cognitive Engineering and Decision Making, 2, 2 (2008), 140-160.
[33] Vidulich, M. A., Wickens, C. D., Causes of dissociation between subjective workload measures and performance: Caveats for the use of subjective assessments, Applied Ergonomics, 17, 4 (1986), 291-296.
[34] Li, D., Babcock, J., Parkhurst, D. J., OpenEyes: a low-cost head-mounted eye-tracking solution, In Proc. of the 2006 Symposium on Eye Tracking Research & Applications, 2006, 95-100.
[35] Jameson, A., Modeling Both the Context and the User, Personal Technologies, 5, 1 (2001).
[36] Grootjen, M., Neerincx, M. A., Veltman, J. A., Cognitive task load in a naval ship control centre: from identification to prediction, Ergonomics, 49, 12-13 (2006), 1238-1264.

CLAIMS:
1. A method for monitoring a user performing tasks, the method comprising:
(a) receiving multiple consecutive images of an eye of the user captured over a period of time;
(b) determining a value representing an eye feature in each of the multiple consecutive images to obtain multiple consecutive values;
(c) determining a first measure of change that characterises change of the eye feature over the multiple consecutive images based on the multiple consecutive values; and
(d) determining whether the user transitioned from a first task to a second task during the period of time based on the first measure of change.
2. The method of claim 1, further comprising receiving one or more transition classification parameters that are based on the first measure of change during historical task transitions, wherein determining whether the user transitioned from a first task to a second task during the period of time is based on the transition classification parameters.
3. The method of claim 2, wherein the transition classification parameters are logistic regression parameters.
4. The method of any one of the preceding claims, further comprising sending a detection signal if it is determined that the user transitioned from the first task to the second task.
5. The method of claim 4, wherein sending the detection signal comprises sending the detection signal to a data store to store the detection signal associated with a time value.
6. The method of claim 4 or 5, wherein sending the detection signal comprises sending the detection signal to a computer system to trigger a response of the computer system.
7. The method of any one of the preceding claims, further comprising presenting a notification to the user if it is determined that the user transitioned from the first task to the second task.
8. The method of any one of the preceding claims, further comprising selectively determining a second measure of change and determining perceptual load based on the second measure of change for the time period where it is determined that the user did not transition from a first task to a second task within that time period.
9. The method of any one of the preceding claims, further comprising selectively determining a third measure of change and determining cognitive load based on the third measure of change for the time period where it is determined that the user did not transition from a first task to a second task within that time period.
10. The method of claim 9, wherein cognitive load is selectively determined where perceptual load is below a threshold.
11. The method of any one of the preceding claims, further comprising:
receiving a stream of images;
selecting from the stream of images the multiple consecutive images based on the period of time; and
repeating the steps of receiving, selecting, determining the value, determining the first measure of change and determining whether the user transitioned from a first task to a second task,
wherein the period of time of a current repetition is based on the determination whether the user transitioned from a first task to a second task in one or more preceding repetitions.
12. The method of claim 11, wherein the period of time of a first repetition overlaps with the period of time of a second repetition immediately preceding or following the first repetition, such that one or more of the multiple consecutive images of the first repetition are also included in the multiple consecutive images of the second repetition.
13. The method of any one of the preceding claims, wherein the number of the multiple consecutive images is such that the processing time of one repetition of the method by a processor is shorter than the period of time.
14. The method of any one of the preceding claims, further comprising:
locating a boundary of the pupil of the eye in each of the multiple images; and
fitting one or more ellipses to the boundary, each ellipse defining an axis,
wherein determining the value representing the eye feature is based on one or more of the axes of the one or more ellipses.
15. The method of claim 14, wherein the fitting step comprises fitting a first ellipse to the boundary and fitting a second ellipse to only a section of the boundary that is located below a dividing line through the pupil.
16. The method of claim 14 or 15,
wherein the first ellipse defines a minor axis and the second ellipse defines a major axis, and
wherein determining the measure of change comprises detecting a blink if the major axis of the second ellipse is greater than twice the minor axis of the first ellipse.
17. The method of claim 15, wherein each of the first ellipse and the second ellipse defines a centre point; and
wherein determining the measure of change is based on a difference between the centre point of the first ellipse and the centre point of the second ellipse.
18. The method of claim 17, wherein the second ellipse further defines a major axis and determining the measure of change comprises detecting a blink if the difference is less than half the major axis of the second ellipse.
19. The method of any one of the preceding claims, wherein the measure of change is based on one or more of:
blink,
pupil dilation,
fixation, and
saccade.
20. The method of any one of the preceding claims, wherein the measure of change is based on one or more of:
average pupil size,
pupil size range,
pupil size standard deviation,
average pupil orientation,
pupil orientation range,
pupil orientation standard deviation,
average pupil eccentricity,
pupil eccentricity range,
pupil eccentricity standard deviation,
number of fixations during the period of time,
fixation duration per fixation,
standard deviation of the fixation duration,
first difference of fixation duration,
entropy-based feature on fixation,
average amplitude of saccade,
standard deviation of saccade amplitude,
first difference of saccade amplitude,
average spectral power over a frequency band,
blink number,
inter-blink interval,
blink duration per blink number,
blink product,
blink rate, and
blink duration per blink.
21. Software, that when installed on a computer causes the computer to perform the method of any one or more of claims 1 to 20.
22. A computer system for monitoring a user performing tasks, the computer system comprising:
an input port to receive multiple consecutive images of an eye of the user captured over a period of time; and
a processor
to determine a value representing an eye feature in each of the multiple consecutive images to obtain multiple consecutive values;
to determine a first measure of change that characterises change of the eye feature over the multiple consecutive images based on the multiple consecutive values; and
to determine whether the user transitioned from a first task to a second task during the period of time based on the first measure of change.
23. A wearable device for monitoring a user performing tasks, the wearable device comprising:
one or more cameras to capture over a period of time multiple consecutive images of an eye of the user;
a feature extractor to determine a value representing an eye feature in each of the multiple consecutive images to obtain multiple consecutive values;
an analyser to determine a first measure of change that characterises change of the eye feature over the multiple consecutive images based on the multiple consecutive values;
a detector to determine whether the user transitioned from a first task to a second task during the period of time based on the first measure of change; and
an output port to send a detection signal if the detector determined that the user transitioned from the first task to the second task.
24. The wearable device of claim 23, further comprising a light source to illuminate the eye of the user.
25. The wearable device of claim 24, wherein the light source is an infra-red light source.
PCT/AU2014/000292 2013-03-19 2014-03-19 Automatic detection of task transition WO2014146168A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2014234955A AU2014234955B2 (en) 2013-03-19 2014-03-19 Automatic detection of task transition

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
AU2013900958A AU2013900958A0 (en) 2013-03-19 Automatic detection of task transition
AU2013900958 2013-03-19

Publications (1)

Publication Number Publication Date
WO2014146168A1 true WO2014146168A1 (en) 2014-09-25

Family ID=51579220

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AU2014/000292 WO2014146168A1 (en) 2013-03-19 2014-03-19 Automatic detection of task transition

Country Status (2)

Country Link
AU (1) AU2014234955B2 (en)
WO (1) WO2014146168A1 (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6421064B1 (en) * 1997-04-30 2002-07-16 Jerome H. Lemelson System and methods for controlling automatic scrolling of information on a display screen
WO2003050658A2 (en) * 2001-12-12 2003-06-19 Eyetools Techniques for facilitating use of eye tracking data
US20050047629A1 (en) * 2003-08-25 2005-03-03 International Business Machines Corporation System and method for selectively expanding or contracting a portion of a display using eye-gaze tracking
US20100079508A1 (en) * 2008-09-30 2010-04-01 Andrew Hodge Electronic devices with gaze detection capabilities
US20120176383A1 (en) * 2009-12-03 2012-07-12 International Business Machines Corporation Vision-based computer control
US8235529B1 (en) * 2011-11-30 2012-08-07 Google Inc. Unlocking a screen using eye tracking information

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018042261A1 (en) * 2016-09-02 2018-03-08 Tata Consultancy Services Limited Method and system for monitoring of mental effort
EP3528706A4 (en) * 2016-10-21 2020-06-24 Tata Consultancy Services Limited System and method for digitized digit symbol substitution test
EP3466338A1 (en) * 2017-10-03 2019-04-10 Tata Consultancy Services Limited Cognitive load estimation based on pupil dilation
US11141061B2 (en) 2017-10-03 2021-10-12 Tata Consultancy Services Limited Cognitive load estimation based on pupil dilation
US11175736B2 (en) 2017-11-10 2021-11-16 South Dakota Board Of Regents Apparatus, systems and methods for using pupillometry parameters for assisted communication
WO2020182281A1 (en) * 2019-03-08 2020-09-17 Toyota Motor Europe Electronic device, system and method for determining the perceptual capacity of an individual human
CN114343640A (en) * 2022-01-07 2022-04-15 北京师范大学 Attention assessment method and electronic equipment
CN114343640B (en) * 2022-01-07 2023-10-13 北京师范大学 Attention assessment method and electronic equipment

Also Published As

Publication number Publication date
AU2014234955B2 (en) 2020-01-02
AU2014234955A1 (en) 2015-10-15

Similar Documents

Publication Publication Date Title
Mahanama et al. Eye movement and pupil measures: A review
Santini et al. PuRe: Robust pupil detection for real-time pervasive eye tracking
AU2014234955B2 (en) Automatic detection of task transition
US11928632B2 (en) Ocular system for deception detection
Cecotti et al. Best practice for single-trial detection of event-related potentials: Application to brain-computer interfaces
US20100100001A1 (en) Fixation-locked measurement of brain responses to stimuli
Hosp et al. RemoteEye: An open-source high-speed remote eye tracker: Implementation insights of a pupil-and glint-detection algorithm for high-speed remote eye tracking
Chen et al. Automatic and continuous user task analysis via eye activity
Beltrán et al. Computational techniques for eye movements analysis towards supporting early diagnosis of Alzheimer’s disease: a review
Stuart et al. Quantifying saccades while walking: validity of a novel velocity-based algorithm for mobile eye tracking
KR20190141684A (en) SYSTEM FOR ASSESSING A HEALTH CONDITION OF A USER
Putze et al. Locating user attention using eye tracking and EEG for spatio-temporal event selection
Jayawardena et al. Automated filtering of eye gaze metrics from dynamic areas of interest
KR101955293B1 (en) Visual fatigue analysis apparatus and method thereof
Alzahrani et al. Eye blink rate based detection of cognitive impairment using in-the-wild data
Chen et al. Blinking: Toward wearable computing that understands your current task
Khan et al. Facial expression recognition using entropy and brightness features
JP2024512045A (en) Visual system for diagnosing and monitoring mental health
Remeseiro et al. Automatic eye blink detection using consumer web cameras
Gienko et al. Neurophysiological features of human visual system in augmented photogrammetric technologies
Wang et al. Visual behaviors analysis based on eye tracker in subjective image quality assessment
Wang et al. Towards region-of-attention analysis in eye tracking protocols
Thannoon et al. A survey on deceptive detection systems and technologies
Nishizono et al. Human Eyeblink Detection in the Field Using Wearable Eye-Trackers
Rawat et al. Heart Rate Monitoring Using External Camera

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14769813

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2014234955

Country of ref document: AU

Date of ref document: 20140319

Kind code of ref document: A

122 Ep: pct application non-entry in european phase

Ref document number: 14769813

Country of ref document: EP

Kind code of ref document: A1