radeonsi: implement TGSI_OPCODE_BFI (v2)

v2: Don't use the intrinsics; the shader backend can recognize these
    patterns and generate optimal code automatically.
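
For illustration only, here is a minimal sketch in plain C of the kind of
shift-and-mask pattern meant above, assuming BFI follows GLSL
bitfieldInsert(base, insert, offset, bits) semantics. The helper name
bfi_sketch is hypothetical and this is not the code added by this patch;
whether a given backend folds this exact expression into a single
instruction is an assumption.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the driver: insert the low `bits`
 * bits of `insert` into `base`, starting at bit `offset`.
 * Assumes offset < 32 and offset + bits <= 32, as GLSL requires. */
static uint32_t bfi_sketch(uint32_t base, uint32_t insert,
                           uint32_t offset, uint32_t bits)
{
    /* Shifting a 32-bit value by 32 is undefined in C, so the
     * full-width case is handled separately. */
    uint32_t low = (bits >= 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);

    /* Mask covering the destination bitfield. */
    uint32_t mask = low << offset;

    /* Bits to insert, moved into position. */
    uint32_t shifted = insert << offset;

    /* Bitwise select (mask & shifted) | (~mask & base), written in the
     * equivalent xor form base ^ (mask & (shifted ^ base)). */
    return base ^ (mask & (shifted ^ base));
}
```

For example, bfi_sketch(0xFFFFFFFF, 0x0, 8, 8) clears bits 8..15 of the base
and leaves the rest untouched.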

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
diff --git a/docs/GL3.txt b/docs/GL3.txt
index 267740a..b295149 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -102,7 +102,7 @@
   - Dynamically uniform UBO array indices              DONE (r600)
   - Implicit signed -> unsigned conversions            DONE
   - Fused multiply-add                                 DONE ()
-  - Packing/bitfield/conversion functions              DONE (r600)
+  - Packing/bitfield/conversion functions              DONE (r600, radeonsi)
   - Enhanced textureGather                             DONE (r600, radeonsi)
   - Geometry shader instancing                         DONE (r600)
   - Geometry shader multiple streams                   DONE ()