WO2005088441A2 - Inserting bits within a data word - Google Patents

Publication number
WO2005088441A2
Authority
WO
WIPO (PCT)
Prior art keywords
register
shift
value
data
bits
Prior art date
Application number
PCT/GB2004/003343
Other languages
French (fr)
Other versions
WO2005088441A3 (en)
Inventor
Simon Andrew Ford
Paul Matthew Carpenter
Original Assignee
Arm Limited
Application filed by Arm Limited
Priority to EP04743646A (patent EP1723512A2)
Priority to JP2007502375A (patent JP2007528545A)
Publication of WO2005088441A2
Publication of WO2005088441A3
Priority to IL177507A (patent IL177507A)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003: Arrangements for executing specific machine instructions
    • G06F9/30007: Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30018: Bit or string instructions
    • G06F9/30032: Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE
    • G06F9/30036: Instructions to perform operations on packed data, e.g. vector, tile or matrix operations

Abstract

A data processing system (2) is provided which supports shift-and-insert instructions SLI, SRI which serve to shift a source data value by a specified shift amount and then insert bits from that shifted value other than the shifted-in bits into a destination value with the remaining bits within that destination value being unaltered.

Description

INSERTING BITS WITHIN A DATA WORD
This invention relates to data processing systems. More particularly, this invention relates to the insertion of bits within a data word under program control.
It is known within data processing systems to pack together a plurality of fields of bits within a single data word. As an example, within a 16 bit data word it may be desired to pack three colour component values respectively representing red, green and blue values, two of which are 5 bits in length and one of which is 6 bits in length. It is often the case that these different component values will be separately processed and their magnitudes separately calculated. After such calculations the separate components require assembling together within a single data word such that they may be stored in a more compact form and more readily manipulated on a pixel-by-pixel basis. In order to achieve such data packing, one possible solution is to provide program instructions which specify both a length of a bit field within a source register which is to be inserted into a destination register and the position within the destination register at which that bit field is to be inserted. Such an instruction will typically have to specify the source register, the destination register, the bit field length and the bit field insertion position. Having to specify four separate parameters within a single instruction in this way places a disadvantageously high demand upon the instruction bit space available within the instruction.
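By way of illustration only, the conventional four-parameter bit field insertion described above may be sketched as follows. The function name `bit_field_insert` and the default 16 bit word width are assumptions made for this example and do not appear in the source.

```python
def bit_field_insert(dest: int, src: int, position: int, length: int,
                     width: int = 16) -> int:
    """Insert the low `length` bits of `src` into `dest` at bit `position`.

    Hypothetical model of the conventional instruction: it needs a source,
    a destination, a bit field length and an insertion position.
    """
    if length <= 0 or position < 0 or position + length > width:
        raise ValueError("bit field must lie within the data word")
    word_mask = (1 << width) - 1
    field_mask = ((1 << length) - 1) << position
    # Clear the target field in dest, then drop the source field into it.
    return (dest & ~field_mask & word_mask) | ((src << position) & field_mask)


# Packing 5-bit red, 6-bit green and 5-bit blue values into a 16 bit word.
word = bit_field_insert(0, 21, position=11, length=5)
word = bit_field_insert(word, 15, position=5, length=6)
word = bit_field_insert(word, 19, position=0, length=5)
```

Each call carries both a position and a length, which is the instruction bit space cost the passage above refers to.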
Viewed from one aspect the present invention provides apparatus for processing data, said apparatus comprising: a plurality of registers operable to store data values to be manipulated; processing logic operable to perform a data processing operation upon one or more data values stored within said plurality of registers; and an instruction decoder responsive to a program instruction to control said processing logic to perform a data processing operation specified by said program instruction; wherein said instruction decoder is responsive to a shift-and-insert instruction to control said processing logic to perform a shift-and-insert data processing operation yielding a result having a result value given by: shifting a first data value stored within a first register by a shift amount of N bit positions to form a shifted value including N shifted-in bits, where N has one of a plurality of different non-zero values; and inserting respective bits of said shifted value, other than said N shifted-in bits, into corresponding bit positions in a second data value stored within a second register with bits within said second data value corresponding to said N shifted-in bits being unaltered thereby forming said result.
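The result value defined above may be modelled in software roughly as follows. This is a minimal sketch: the function names, the default 16 bit width and the restriction of N to the range 1 to width-1 are assumptions made for the example.

```python
def shift_left_insert(dest: int, src: int, n: int, width: int = 16) -> int:
    """Model of a left shift-and-insert: the low n bits of dest are unaltered."""
    if not 0 < n < width:
        raise ValueError("shift amount assumed to be in 1..width-1")
    word_mask = (1 << width) - 1
    shifted_in = (1 << n) - 1            # positions of the N shifted-in bits
    shifted = (src << n) & word_mask
    return (shifted & ~shifted_in) | (dest & shifted_in)


def shift_right_insert(dest: int, src: int, n: int, width: int = 16) -> int:
    """Model of a right shift-and-insert: the high n bits of dest are unaltered."""
    if not 0 < n < width:
        raise ValueError("shift amount assumed to be in 1..width-1")
    word_mask = (1 << width) - 1
    shifted_in = word_mask & ~(word_mask >> n)   # top n bit positions
    shifted = (src & word_mask) >> n
    # The values of the shifted-in bits never reach the result, so it does
    # not matter whether the shift is logical or arithmetic.
    return (shifted & ~shifted_in) | (dest & shifted_in)
```

For example, with `dest = 0xABCD`, `src = 0x1234` and `n = 4`, the left variant gives `0x234D` and the right variant gives `0xA123`.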
The present invention recognises that in a large proportion of the cases in which it is desired to use such instructions the full flexibility of being able to separately specify the bit field length and the bit field position is not required. Instead, a single parameter specifying the amount of shift to be applied to the source value controls the starting position at which bits of the source value are written into the destination value. The bits written into the destination value extend from the starting position to the appropriate end of the destination value depending upon the shift direction of the instruction concerned. It may be that a greater number of bits are inserted than are ultimately required. However, the present technique recognises that in a large proportion of cases multiple such instructions are executed and the excess bits written by one instruction will be over-written with the desired data by a following instruction such that the final packed data value will contain the correct bits as desired. Thus, packing operations, or other bit assembly operations, can be achieved with instructions having an advantageously small instruction bit space requirement.
It will be appreciated that the expression of the invention as set out above is made in terms of a shift-and-insert instruction yielding a result having a result value given by a specified shift operation and a specified insert operation. It will be appreciated that the actual mechanisms employed to achieve a result value the same as if such a shift and insert had been performed can vary. Such variant mechanisms and steps are encompassed by the present technique. The shifting and inserting steps are one way of expressing how the desired end result is related to the inputs but the same relationship between inputs and outputs may be achieved and expressed in a variety of different ways. These alternatives are encompassed within the present technique.
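As an illustration of excess bits being over-written by a following instruction, the sketch below assembles an 8 bit word from three fields using two left shift-and-insert steps. The field widths (2, 3 and 3 bits), the values and the function name are hypothetical choices for this example.

```python
def shift_left_insert(dest: int, src: int, n: int, width: int = 8) -> int:
    """Left shift-and-insert model: the low n bits of dest are kept."""
    word_mask = (1 << width) - 1
    kept = (1 << n) - 1
    return ((src << n) & word_mask) | (dest & kept)


field_c = 0b101          # 3-bit field destined for bits 2..0
field_b_reg = 0b11010    # 3-bit field 0b010 with two unwanted bits (0b11) above it
field_a = 0b01           # 2-bit field destined for bits 7..6

# First step: field b lands in bits 5..3, but its unwanted upper bits
# are also written, into bits 7..6.
after_b = shift_left_insert(field_c, field_b_reg, 3)

# Second step: field a is written into bits 7..6, replacing the excess bits.
after_a = shift_left_insert(after_b, field_a, 6)
```

Here `after_b` is `0b11010101`, with the excess bits visible in the top two positions, and `after_a` is `0b01010101`, the correctly packed value.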
The shift amount could be specified as a value stored within a register specified within the instruction or alternatively and preferably as an immediate value encoded within the shift-and-insert instruction itself.
The first and second registers are advantageously specified by source register specifying fields, both relative to the registers of a register bank. A destination register specifying field (optionally shared with one of the first register or second register) may also be used.
It will be appreciated that the shifting of the present technique may be either right shifting or left shifting depending upon the circumstances and the desired form of packing or bit insertion. It is possible that the first data value and the second data value could have different bit lengths and be stored in registers of different lengths although in preferred embodiments the first data value and the second data value have the same number of bits.
As previously mentioned it will be appreciated that the relationship between the inputs and the outputs as set out above may be implemented in a variety of different ways although a preferred way is to use a shift of the first value and to form a mask value for selecting which bits within the second data value are replaced by corresponding bits within the shifted data value and which bits within the second data value are unaltered. This mask value can advantageously be formed by a shift upon a starting mask or by alternative techniques such as a decode of the instruction directly forming the mask.
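A software sketch of this mask-based approach is given below for the left shifting case. The starting mask of all ones is shifted by the same amount as the first data value, and each result bit is then selected by the corresponding mask bit. The function name and the 8 bit width are assumptions for the example, and the per-bit loop stands in for a multi-bit selection in hardware.

```python
def shift_left_insert_masked(dest: int, src: int, n: int, width: int = 8) -> int:
    """Form the result by selecting, bit by bit, under control of a shifted mask."""
    all_ones = (1 << width) - 1
    shifted_value = (src << n) & all_ones
    shifted_mask = (all_ones << n) & all_ones   # zeros mark the shifted-in positions

    result = 0
    for bit in range(width):
        select_shifted = (shifted_mask >> bit) & 1
        chosen_word = shifted_value if select_shifted else dest
        result |= ((chosen_word >> bit) & 1) << bit
    return result
```

The loop is equivalent to the single expression `(shifted_value & shifted_mask) | (dest & ~shifted_mask)`, which is the form more likely to be used in practice.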
The present technique may be used with advantage both within scalar processing systems and single instruction multiple data (SIMD) processing systems.
Viewed from another aspect the present invention provides a method of processing data, said method comprising the steps of: storing data values to be manipulated within a plurality of registers; performing a data processing operation using processing logic upon one or more data values stored within said plurality of registers; and in response to a program instruction, using an instruction decoder to control said processing logic to perform a data processing operation specified by said program instruction; wherein said instruction decoder is responsive to a shift-and-insert instruction to control said processing logic to perform a shift-and-insert data processing operation yielding a result having a result value given by: shifting a first data value stored within a first register by a shift amount of N bit positions to form a shifted value including N shifted-in bits, where N has one of a plurality of different non-zero values; and inserting respective bits of said shifted value, other than said N shifted-in bits, into corresponding bit positions in a second data value stored within a second register with bits within said second data value corresponding to said N shifted-in bits being unaltered thereby forming said result.
Embodiments of the invention will now be described, by way of example only, with reference to the accompanying drawings in which:
Figure 1 schematically illustrates a data processing system of the type which may utilise the present techniques;
Figure 2 schematically illustrates the syntax of three different shift-and-insert instructions in accordance with one example of the present technique;
Figure 3 schematically illustrates the action of a shift-and-insert instruction;
Figure 4 schematically illustrates a hardware arrangement for performing a shift-and-insert operation;
Figure 5 illustrates an example of a pixel value packing operation within a scalar processing system; and
Figure 6 schematically illustrates a pixel packing operation within a single instruction multiple data system.
Figure 1 schematically illustrates a data processing system 2 which may be in the form of an integrated circuit including a register bank 4, a multiplier 6, a shifter 8 and an adder 10. The register bank 4, the multiplier 6, the shifter 8 and the adder 10 can be considered to form processing logic for executing desired data processing operations under control of control signals generated by an instruction decoder 12. The instruction decoder 12 is itself responsive to program instructions loaded into an instruction pipeline 14. It will be appreciated that the data processing system of Figure 1 will typically contain many more elements, but these have been omitted for the sake of clarity. In operation, program instructions are fetched into the instruction pipeline 14 and when they reach an execute stage within the instruction pipeline 14 are used by the instruction decoder 12 to generate control signals which configure the various elements of the processing logic 4, 6, 8, 10 to execute the desired data processing operation. The processing logic will typically include many further elements for providing processing operations other than the simple multiply, shift and add operations illustrated in Figure 1.
Figure 2 schematically illustrates the syntax of some example shift-and-insert instructions which may be supported by the data processing system 2 of Figure 1. A shift left and insert instruction SLI includes a register field specifying a destination register dest, a register field specifying a source register src and a field specifying an immediate value #imm. The source register contains the data value which is to be shifted and then inserted into the destination register with some of the bits within the destination register being unaltered. The immediate value #imm specifies the amount of shift applied to the source register value before the insertion takes place and also effectively the position at which the insertion takes place as will be described further below.
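The effect of a shift left and insert instruction with an immediate shift amount may be sketched with a small interpreter over a register bank. The textual form `SLI dest, src, #imm`, the register names, the 32 bit register width and the function name `execute_sli` are all assumptions made for this example; the source does not give the exact assembler syntax.

```python
import re


def execute_sli(instruction: str, registers: dict, width: int = 32) -> None:
    """Execute one hypothetical 'SLI dest, src, #imm' instruction in place."""
    match = re.fullmatch(r"\s*SLI\s+(\w+)\s*,\s*(\w+)\s*,\s*#(\d+)\s*", instruction)
    if match is None:
        raise ValueError(f"unrecognised instruction: {instruction!r}")
    dest, src, imm = match.group(1), match.group(2), int(match.group(3))
    if not 0 < imm < width:
        raise ValueError("shift amount assumed to be in 1..width-1")

    word_mask = (1 << width) - 1
    unaltered = (1 << imm) - 1          # low bits of dest that stay as they were
    shifted = (registers[src] << imm) & word_mask
    registers[dest] = shifted | (registers[dest] & unaltered)


bank = {"r0": 0xAAAAAAAA, "r1": 0x0000BEEF}
execute_sli("SLI r0, r1, #16", bank)
```

After the call, `bank["r0"]` holds `0xBEEFAAAA`: the immediate value fixes both the shift applied to the source and the position from which the destination is over-written.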
Figure 2 also illustrates a right shifting variant of the above instruction, namely an SRI instruction. There is also illustrated a variant of the left shifting instruction in which the shift amount is specified by a second source register src2. It will be appreciated that the syntax and exact form of the instructions as illustrated in Figure 2 is only one example and different embodiments of the present technique may use instruction representations and syntaxes which vary considerably.
Figure 3 illustrates one example of a shift-and-insert operation. A register 16 contains a source value. This source value contains a data portion 18, such as a pixel value. The portion of the register 16 outside of the data portion 18 may represent nothing meaningful or may be a fractional part of the data value which it is desired to discard. The value within the register 16 is in this example subject to a right shift by an amount specified by the immediate field #imm within the SRI instruction concerned. At the left hand end of the register, shifted-in bits are introduced into the shifted value which is generated. This is normal shift operation behaviour. The destination value is held within a register 20 and a portion of the shifted value other than the shifted-in bits is written into this destination value replacing the corresponding bits initially stored within the destination value. The bits within the destination value 20 corresponding to the shifted-in bits within the shifted value are not replaced and are left unaltered. The final result value contains the original destination value with the inserted bits from the shifted value replacing its original bits at those corresponding positions. It will be seen in the current example that bits from the shifted value other than only the data portion 18 have been inserted within the result value, namely bits G and H. If significant, these unwanted bits can be overwritten by further bit values in a subsequent shift-and-insert operation, as desired.
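A right shift-and-insert of this kind can be traced symbolically by treating each register as a string of named bits, most significant bit first. The letters, the 8 bit width and the shift amount used here are illustrative and are not taken from Figure 3.

```python
def sri_symbolic(dest: str, src: str, n: int) -> str:
    """Right shift-and-insert on strings of bit labels (most significant first)."""
    if len(dest) != len(src):
        raise ValueError("registers assumed to have the same number of bits")
    if not 0 < n < len(src):
        raise ValueError("shift amount assumed to be in 1..width-1")
    # n shifted-in bits appear at the left; the n rightmost source bits fall off.
    shifted = "0" * n + src[:-n]
    # The destination keeps its n leftmost bits; the rest come from the shifted value.
    return dest[:n] + shifted[n:]


result = sri_symbolic(dest="stuvwxyz", src="ABCDEFGH", n=3)
```

With these inputs `result` is `"stuABCDE"`. If only `ABC` were the wanted data portion, `D` and `E` would be excess bits of the kind that a later shift-and-insert can overwrite.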
Figure 4 schematically illustrates a hardware representation as to how the shift-and-insert instruction can be implemented. The source register 22 supplies its value to, in this example, a left shifter 24. It will be appreciated that for the right shifting variants of the instruction right shifting circuits will be used instead. The shift amount is specified in this example by an immediate value of #4. The left shifter 24 produces a shifted value 26 with four shifted-in zero values in its four rightmost bit positions. In parallel with the generation of the shifted value 26, a mask value is produced by taking a starting mask value 28 containing all ones and subjecting this to the same shift as is being applied to the source value 22 with its own left shifter 30. The shifted-in values into the mask value are again zeros and this results in a shifted mask 32. The shifted mask 32 can then be used as a multi-bit control signal supplied to a multi-bit multiplexer 34 which selects either bits from the shifted value 26 or bits from the destination value 36 to feed to a result value 38.
Figure 5 illustrates a scalar packing operation of red, green and blue pixel value components into a 16 bit result value. The first operation executes upon the red and green component values using a right shift-and-insert instruction with a shift amount of five bit positions. This leaves the 5-bit red component R5 unaltered within the result but writes in the green component G6 as well as its remainder into that result. The second instruction takes the combined red and green component and inserts into it the blue component B5 by performing a right shift-and-insert instruction with a shift amount of 11 bit positions such that the blue component B5 abuts the end of the already inserted green component G6 and fills the remaining positions within the 16 bit result value.
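The two-instruction pixel packing of Figure 5 may be sketched as follows. The sketch assumes that each colour component sits at the most significant end of its own 16 bit register, with any lower bits being a remainder to be discarded; the register contents and the function name are hypothetical.

```python
def shift_right_insert(dest: int, src: int, n: int, width: int = 16) -> int:
    """Right shift-and-insert model: the top n bits of dest are unaltered."""
    word_mask = (1 << width) - 1
    inserted = word_mask >> n                 # positions taken from the shifted source
    return ((src & word_mask) >> n) | (dest & ~inserted & word_mask)


red_reg = 0xAFFF     # top 5 bits 0b10101 (21), lower bits are remainder
green_reg = 0x3C5A   # top 6 bits 0b001111 (15), lower bits are remainder
blue_reg = 0x9ABC    # top 5 bits 0b10011 (19), lower bits are remainder

# First instruction: shift by 5. Red is kept; green and its remainder are written.
red_green = shift_right_insert(red_reg, green_reg, 5)

# Second instruction: shift by 11. Blue replaces the green remainder.
pixel = shift_right_insert(red_green, blue_reg, 11)
```

The intermediate value `red_green` is `0xA9E2`, still carrying part of the green remainder in its low five bits, and the final `pixel` is `0xA9F3`, which equals `(21 << 11) | (15 << 5) | 19`.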
Figure 6 illustrates the same type of packing operation as illustrated in Figure 5, but in this case performed within a single instruction multiple data (SIMD) system. As will be appreciated, the same shift-and-insert operation is separately performed upon each data lane within the SIMD system, enabling four sets of pixel values to be packed together in parallel using two SIMD right shift-and-insert instructions.
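The lane-wise behaviour of Figure 6 can be sketched in the same way. The Python model below is a hedged illustration rather than the described hardware: it assumes a 64-bit register holding four 16-bit lanes, all names (`simd_shift_right_insert`, `pack_four_pixels` and the helpers) are hypothetical, and each 8-bit component is again assumed to sit in the top byte of its lane. The whole register is shifted at once; bits that cross a lane boundary land only in the shifted-in positions of the neighbouring lane, which the replicated mask excludes.

```python
LANES = 4        # assumed number of lanes
LANE_BITS = 16   # assumed lane width
LANE_ONES = (1 << LANE_BITS) - 1
REG_ONES = (1 << (LANES * LANE_BITS)) - 1


def pack_lanes(values):
    """Place one value per lane, lane 0 in the least significant bits."""
    reg = 0
    for lane, value in enumerate(values):
        reg |= (value & LANE_ONES) << (lane * LANE_BITS)
    return reg


def unpack_lanes(reg):
    """Split a register value back into its per-lane values."""
    return [(reg >> (lane * LANE_BITS)) & LANE_ONES for lane in range(LANES)]


def simd_shift_right_insert(dest, src, n):
    """Apply a right shift-and-insert by n independently in every lane."""
    # The same per-lane mask is replicated across all lanes.
    mask = pack_lanes([LANE_ONES >> n] * LANES)
    shifted = (src & REG_ONES) >> n
    # Bits shifted across a lane boundary fall where the mask is 0,
    # so each lane keeps its own top n destination bits.
    return (shifted & mask) | (dest & ~mask & REG_ONES)


def pack_four_pixels(pixels):
    """Pack four (r, g, b) 8-bit triples using two SIMD shift-and-inserts."""
    red = pack_lanes([r << 8 for r, _, _ in pixels])
    green = pack_lanes([g << 8 for _, g, _ in pixels])
    blue = pack_lanes([b << 8 for _, _, b in pixels])
    result = simd_shift_right_insert(red, green, 5)
    result = simd_shift_right_insert(result, blue, 11)
    return unpack_lanes(result)


packed = pack_four_pixels([(255, 0, 0), (0, 255, 0), (0, 0, 255), (18, 52, 86)])
print([hex(value) for value in packed])
```

Each lane of the output should match what the scalar two-instruction sequence would produce for the corresponding pixel.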

Claims

1. Apparatus for processing data, said apparatus comprising: a plurality of registers operable to store data values to be manipulated; processing logic operable to perform a data processing operation upon one or more data values stored within said plurality of registers; and an instruction decoder responsive to a program instruction to control said processing logic to perform a data processing operation specified by said program instruction; wherein said instruction decoder is responsive to a shift-and-insert instruction to control said processing logic to perform a shift-and-insert data processing operation yielding a result having a result value given by: shifting a first data value stored within a first register by a shift amount of N bit positions to form a shifted value including N shifted-in bits, where N has one of a plurality of different non-zero values; and inserting respective bits of said shifted value, other than said N shifted-in bits, into corresponding bit positions in a second data value stored within a second register with bits within said second data value corresponding to said N shifted-in bits being unaltered thereby forming said result.
2. Apparatus as claimed in claim 1, wherein said shift-and-insert instruction includes an immediate value specifying said shift amount of N bit positions.
3. Apparatus as claimed in any one of claims 1 and 2, wherein said shift-and-insert instruction includes a register specifying field specifying a register within a register bank to be used as said first register.
4. Apparatus as claimed in any one of claims 1, 2 and 3, wherein said shift-and-insert instruction includes a register specifying field specifying a register within a register bank to be used as said second register.
5. Apparatus as claimed in any one of the preceding claims, wherein a shift-and-insert instruction includes a register specifying field specifying a register within a register bank to be used as a destination register.
6. Apparatus as claimed in claim 5, wherein said register specifying field for said destination register is shared with one of said first register and second register.
7. Apparatus as claimed in any one of the preceding claims, wherein said first data value is right shifted.
8. Apparatus as claimed in any one of claims 1 to 6, wherein said first data value is left shifted.
9. Apparatus as claimed in any one of the preceding claims, wherein said first data value and said second data value have the same number of bits.
10. Apparatus as claimed in any one of the preceding claims, wherein, in response to said shift-and-insert instruction, said processing logic is operable to shift said first value.
11. Apparatus as claimed in any one of the preceding claims, wherein, in response to said shift-and-insert instruction, said processing logic is operable to form a mask value for selecting which bits within said second data value are replaced by corresponding bits within said shifted data value and which bits within said second data value are unaltered.
12. Apparatus as claimed in any one of the preceding claims, wherein said processing logic is single instruction multiple data processing logic and said first register and said second register are respective portions of a first single instruction multiple data register and a second single instruction multiple data register, said shift-and-insert instruction being operable to control execution in parallel of a plurality of shift-and-insert operations in respective processing lanes.
13. Apparatus as claimed in any one of claims 1 to 11, wherein said processing logic is scalar processing logic.
14. A method of processing data, said method comprising the steps of: storing data values to be manipulated within a plurality of registers; performing a data processing operation using processing logic upon one or more data values stored within said plurality of registers; and in response to a program instruction, using an instruction decoder to control said processing logic to perform a data processing operation specified by said program instruction; wherein said instruction decoder is responsive to a shift-and-insert instruction to control said processing logic to perform a shift-and-insert data processing operation yielding a result having a result value given by: shifting a first data value stored within a first register by a shift amount of N bit positions to form a shifted value including N shifted-in bits, where N has one of a plurality of different non-zero values; and inserting respective bits of said shifted value, other than said N shifted-in bits, into corresponding bit positions in a second data value stored within a second register with bits within said second data value corresponding to said N shifted-in bits being unaltered thereby forming said result.
15. A method as claimed in claim 14, wherein said shift-and-insert instruction includes an immediate value specifying said shift amount of N bit positions.
16. A method as claimed in any one of claims 14 and 15, wherein said shift-and-insert instruction includes a register specifying field specifying a register within a register bank to be used as said first register.
17. A method as claimed in any one of claims 14, 15 and 16, wherein said shift-and-insert instruction includes a register specifying field specifying a register within a register bank to be used as said second register.
18. A method as claimed in any one of claims 14 to 17, wherein said shift-and-insert instruction includes a register specifying field specifying a register within a register bank to be used as a destination register.
19. A method as claimed in claim 18, wherein said register specifying field for said destination register is shared with one of said first register and second register.
20. A method as claimed in any one of claims 14 to 17, wherein said first data value is right shifted.
21. A method as claimed in any one of claims 14 to 17, wherein said first data value is left shifted.
22. A method as claimed in any one of claims 14 to 19, wherein said first data value and said second data value have the same number of bits.
23. A method as claimed in any one of claims 14 to 20, wherein, in response to said shift-and-insert instruction, said processing logic is operable to shift said first value.
24. A method as claimed in any one of claims 14 to 21, wherein, in response to said shift-and-insert instruction, said processing logic is operable to form a mask value for selecting which bits within said second data value are replaced by corresponding bits within said shifted data value and which bits within said second data value are unaltered.
25. A method as claimed in any one of claims 14 to 22, wherein said processing logic is single instruction multiple data processing logic and said first register and said second register are respective portions of a first single instruction multiple data register and a second single instruction multiple data register, said shift-and-insert instruction being operable to control execution in parallel of a plurality of shift-and-insert operations in respective processing lanes.
26. A method as claimed in any one of claims 14 to 22, wherein said processing logic is scalar processing logic.
PCT/GB2004/003343 2004-03-10 2004-08-03 Inserting bits within a data word WO2005088441A2 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP04743646A EP1723512A2 (en) 2004-03-10 2004-08-03 Inserting bits within a data word
JP2007502375A JP2007528545A (en) 2004-03-10 2004-08-03 Apparatus and method for inserting bits into a data word
IL177507A IL177507A (en) 2004-03-10 2006-08-15 Inserting bits within a data word

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB0405407A GB2411978B (en) 2004-03-10 2004-03-10 Inserting bits within a data word
GB0405407.8 2004-03-10

Publications (2)

Publication Number Publication Date
WO2005088441A2 true WO2005088441A2 (en) 2005-09-22
WO2005088441A3 WO2005088441A3 (en) 2006-06-22

Family

ID=32117417

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2004/003343 WO2005088441A2 (en) 2004-03-10 2004-08-03 Inserting bits within a data word

Country Status (11)

Country Link
US (1) US7350058B2 (en)
EP (1) EP1723512A2 (en)
JP (1) JP2007528545A (en)
KR (1) KR100981998B1 (en)
CN (1) CN100538624C (en)
GB (1) GB2411978B (en)
IL (1) IL177507A (en)
MY (1) MY137200A (en)
RU (1) RU2006135629A (en)
TW (1) TWI322947B (en)
WO (1) WO2005088441A2 (en)

Also Published As

Publication number Publication date
US7350058B2 (en) 2008-03-25
US20050204117A1 (en) 2005-09-15
GB2411978A (en) 2005-09-14
EP1723512A2 (en) 2006-11-22
TW200530838A (en) 2005-09-16
MY137200A (en) 2009-01-30
KR20070028322A (en) 2007-03-12
GB0405407D0 (en) 2004-04-21
CN100538624C (en) 2009-09-09
TWI322947B (en) 2010-04-01
KR100981998B1 (en) 2010-09-13
IL177507A (en) 2010-12-30
WO2005088441A3 (en) 2006-06-22
GB2411978B (en) 2007-04-04
IL177507A0 (en) 2006-12-10
CN1926511A (en) 2007-03-07
JP2007528545A (en) 2007-10-11
RU2006135629A (en) 2008-04-20

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 200480042344.3

Country of ref document: CN

AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 2004743646

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 177507

Country of ref document: IL

WWE Wipo information: entry into national phase

Ref document number: 4838/DELNP/2006

Country of ref document: IN

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
WWE Wipo information: entry into national phase

Ref document number: 1020067018123

Country of ref document: KR

WWE Wipo information: entry into national phase

Ref document number: 2007502375

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

WWW Wipo information: withdrawn in national office

Ref document number: DE

WWE Wipo information: entry into national phase

Ref document number: 2006135629

Country of ref document: RU

WWP Wipo information: published in national office

Ref document number: 2004743646

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1020067018123

Country of ref document: KR