x86: use GFP_DMA for 24bit coherent_dma_mask

dma_alloc_coherent (include/asm-x86/dma-mapping.h) avoids GFP_DMA
allocation first and if the allocated address is not fit for the
device's coherent_dma_mask, then dma_alloc_coherent does GFP_DMA
allocation. This is because dma_alloc_coherent avoids precious GFP_DMA
zone if possible. This is also how the old dma_alloc_coherent
(arch/x86/kernel/pci-dma.c) works.

However, if the coherent_dma_mask of a device is 24bit, there is no
point to go into the above GFP_DMA retry mechanism. We had better use
GFP_DMA in the first place.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Tested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
diff --git a/arch/x86/include/asm/dma-mapping.h b/arch/x86/include/asm/dma-mapping.h
index 4a5397b..7f225a4 100644
--- a/arch/x86/include/asm/dma-mapping.h
+++ b/arch/x86/include/asm/dma-mapping.h
@@ -255,9 +255,11 @@
 
 static inline gfp_t dma_alloc_coherent_gfp_flags(struct device *dev, gfp_t gfp)
 {
-#ifdef CONFIG_X86_64
 	unsigned long dma_mask = dma_alloc_coherent_mask(dev, gfp);
 
+	if (dma_mask <= DMA_24BIT_MASK)
+		gfp |= GFP_DMA;
+#ifdef CONFIG_X86_64
 	if (dma_mask <= DMA_32BIT_MASK && !(gfp & GFP_DMA))
 		gfp |= GFP_DMA32;
 #endif